The freedom to reorder computations involving associative operators has been widely recognized and exploited in designing parallel algorithms and, to a more limited extent, in optimizing compilers. In this paper, we develop a novel framework that uses the associativity and commutativity of operations in regular loop computations to enhance register reuse. Stencils are an important class of computations to which the framework can be applied to improve performance. We show how stencil operations can be implemented to better exploit register reuse and reduce loads and stores. We develop a multi-dimensional retiming formalism to characterize the space of valid implementations in conjunction with other pr...
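As a rough illustration of the idea of trading loads for register-held partial sums, the following sketch contrasts two ways of computing a 1D three-point stencil. It is not the framework described in the abstract: the stencil shape, the weights `w0`, `w1`, `w2`, and the function names `stencil_gather` and `stencil_scatter` are all assumptions chosen for illustration. The second version loads each input element once and accumulates its contributions into scalars, a regrouping that is only valid because addition is treated as associative and commutative.

```c
#include <stddef.h>

/* Reference "gather" form (hypothetical name): each output reads its
   three neighbours. Written naively, this is three loads per output. */
void stencil_gather(const double *in, double *out, size_t n,
                    double w0, double w1, double w2)
{
    for (size_t i = 1; i + 1 < n; ++i)
        out[i] = w0 * in[i - 1] + w1 * in[i] + w2 * in[i + 1];
}

/* "Scatter" (accumulate) form (hypothetical name): each input element is
   loaded once and added into the partial sums of the outputs it feeds.
   The partial sums p and q are scalars that a compiler can keep in
   registers. In floating point, regrouping additions can change rounding
   in general, so this assumes reassociation is acceptable. */
void stencil_scatter(const double *in, double *out, size_t n,
                     double w0, double w1, double w2)
{
    double p = 0.0; /* partial sum for out[j-1]: w0*in[j-2] + w1*in[j-1] */
    double q = 0.0; /* partial sum for out[j]:   w0*in[j-1]              */

    for (size_t j = 0; j < n; ++j) {
        double x = in[j];          /* the only load in this iteration */

        if (j >= 2)
            out[j - 1] = p + w2 * x; /* last contribution: out[j-1] is done */

        p = q + w1 * x;            /* becomes the partial sum for out[j]   */
        q = w0 * x;                /* starts the partial sum for out[j+1]  */
    }
}
```

Both functions leave the boundary elements `out[0]` and `out[n-1]` untouched. In one dimension the benefit over ordinary scalar replacement is small; the sketch is meant only to show the shape of an accumulate-style implementation.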
In this paper, we propose a compilation scheme to analyze and exploit the implicit reuse...
The generation of efficient sequential code for synchronous data-flow language...
This paper presents a new technique for the problem of allocating and assigning registers to variabl...
Register allocation is generally considered a practically solved problem. For ...
Application codes reliably achieve performance far less than the advertised capabilities of existing...
This paper describes a new technique for optimizing serial and parallel stencil- and stencil-like op...
The key common bottleneck in most stencil codes is data movement, and prior research has shown that ...
In the context of developing a compiler for Alpha, a functional data-parallel language based on ...
Performance optimization of stencil computations has been widely studied in the literature, ...
Loop fusion is a reordering transformation that merges multiple loops into a single loop. It can inc...
Storage mapping optimization is a flexible approach to folding array dimension...