Abstract. This paper describes an approach to synthesizing efficient out-of-core code for a class of imperfectly nested loops that represent tensor contraction computations. Tensor contraction expressions arise in many accurate computational models of electronic structure. The approach combines loop fusion with loop tiling and uses a performance model to drive the tiling when generating out-of-core code. Experimental measurements show a good match with model-based predictions and demonstrate the effectiveness of the proposed algorithm.
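To make the idea of model-driven tiling for out-of-core code concrete, the following is a minimal sketch and not the paper's algorithm. It tiles a single contraction, C[i][j] = sum over k of A[i][k] * B[k][j], over arrays that are assumed to live on disk, and compares the measured I/O volume against a simple analytical prediction. The names DiskArray, contract_tiled, predicted_io, and choose_tile are hypothetical, the "disk" is simulated in memory with element counters, and the cost model (element transfers only, three square tiles resident at once) is an illustrative assumption. Loop fusion, which the paper combines with tiling, is not shown.

```python
import math


class DiskArray:
    """Hypothetical stand-in for a disk-resident 2-D array; counts element I/O."""

    def __init__(self, rows, cols, fill=None):
        self.rows, self.cols = rows, cols
        self.data = [[fill(r, c) if fill else 0.0 for c in range(cols)]
                     for r in range(rows)]
        self.reads = 0   # elements transferred disk -> memory
        self.writes = 0  # elements transferred memory -> disk

    def read_tile(self, r0, r1, c0, c1):
        self.reads += (r1 - r0) * (c1 - c0)
        return [row[c0:c1] for row in self.data[r0:r1]]

    def write_tile(self, r0, c0, tile):
        for dr, row in enumerate(tile):
            self.data[r0 + dr][c0:c0 + len(row)] = row
            self.writes += len(row)


def contract_tiled(A, B, C, t):
    """Compute C = A * B holding at most three t-by-t tiles in memory."""
    n, m, p = A.rows, A.cols, B.cols
    for i0 in range(0, n, t):
        i1 = min(i0 + t, n)
        for j0 in range(0, p, t):
            j1 = min(j0 + t, p)
            # The C tile stays in memory across the whole k loop.
            acc = [[0.0] * (j1 - j0) for _ in range(i1 - i0)]
            for k0 in range(0, m, t):
                k1 = min(k0 + t, m)
                a = A.read_tile(i0, i1, k0, k1)
                b = B.read_tile(k0, k1, j0, j1)
                for i in range(i1 - i0):
                    for k in range(k1 - k0):
                        aik = a[i][k]
                        for j in range(j1 - j0):
                            acc[i][j] += aik * b[k][j]
            C.write_tile(i0, j0, acc)  # each C element is written once


def predicted_io(n, m, p, t):
    """Assumed cost model: total element transfers for contract_tiled."""
    tiles_i = -(-n // t)  # ceil(n / t)
    tiles_j = -(-p // t)  # ceil(p / t)
    # A is re-read once per column tile of C, B once per row tile of C.
    return n * m * tiles_j + m * p * tiles_i + n * p


def choose_tile(mem_limit_elems):
    """Largest t such that three t-by-t tiles fit in the memory limit."""
    return math.isqrt(mem_limit_elems // 3)


if __name__ == "__main__":
    n, m, p = 6, 5, 4
    t = choose_tile(12)  # a 12-element memory budget, chosen for illustration
    A = DiskArray(n, m, lambda r, c: float(r + c))
    B = DiskArray(m, p, lambda r, c: float(r * c + 1))
    C = DiskArray(n, p)
    contract_tiled(A, B, C, t)
    print("tile size:", t)
    print("measured I/O:", A.reads + B.reads + C.writes)
    print("predicted I/O:", predicted_io(n, m, p, t))
```

Under this toy model a larger tile never increases the predicted transfer volume, so choose_tile simply picks the largest tile that fits. A real out-of-core cost model would also have to account for disk block sizes and seek costs, which this sketch ignores.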
This paper describes an algorithm to optimize cache locality in scientific codes on uniprocessor and m...
Abstract. Empirical optimizers like ATLAS have been very effective in optimizing computational kerne...
Abstract. Complex tensor contraction expressions arise in accurate electronic structure models in qu...
We address the problem of efficient out-of-core code generation for a special class of imperfectly n...
Abstract. The goal of our project is the development of a program synthesis system to facilitate the...
This paper discusses a program synthesis system to facilitate the generation of high-performance pa...
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor co...
This paper presents a technique for memory optimization for a class of computations that arises in t...
Abstract. Most scientific programs have large input and output data sets that require out-of-core pro...
Complex tensor contraction expressions arise in accurate electronic structure models in quantum chem...
This paper presents compiler algorithms to optimize out-of-core programs. These algorithms consider ...