Address generation for compiling HPF programs into executable SPMD code is a necessary phase of a parallelizing compiler. This paper presents an efficient compilation technique to generate the local memory access sequences for block-cyclically distributed array references with affine subscripts in data-parallel programs. For the memory accesses of an array reference with an affine subscript within a doubly nested loop, repetitive patterns exist at both the outer and inner loops. We use tables to record the memory accesses of these repetitive patterns. Based on these tables, a new start-computation algorithm is proposed to compute the starting elements on a processor for each outer loop iteration. The comp...
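To make the notion of a local memory access sequence concrete, the following is a minimal brute-force sketch, not the table-based technique of the abstract above. It assumes 0-based global indices, a one-dimensional array distributed cyclic(k) over p processors, and a single loop with subscript a*i + c; the function and parameter names (`owner_and_local`, `local_access_sequence`, `proc`, `lo`, `hi`) are hypothetical and chosen only for illustration.

```python
def owner_and_local(g, p, k):
    """Map a non-negative global index g to (owner, local address)
    under a cyclic(k) distribution over p processors (0-based)."""
    block = g // k                      # global block that holds g
    owner = block % p                   # blocks are dealt round-robin
    local = (block // p) * k + g % k    # local block offset plus offset in block
    return owner, local


def local_access_sequence(proc, p, k, a, c, lo, hi, step=1):
    """Brute force: local addresses on processor `proc` touched by the
    reference A(a*i + c) for i = lo, lo+step, ..., hi (inclusive)."""
    seq = []
    for i in range(lo, hi + 1, step):
        owner, local = owner_and_local(a * i + c, p, k)
        if owner == proc:
            seq.append(local)
    return seq


if __name__ == "__main__":
    # Example: 4 processors, block size 4, reference A(3*i), i = 0..20.
    for proc in range(4):
        print(proc, local_access_sequence(proc, p=4, k=4, a=3, c=0, lo=0, hi=20))
```

This enumeration visits every loop iteration on every processor, so it serves only as a reference for checking results; the purpose of techniques like the one described above is to avoid exactly this cost by exploiting the repetitive pattern in the sequence.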
Dataflow-based fine-grain parallel data-structures provide high-level abstraction to easily write pr...
The concept of Parallel Vector (scratch pad) Memories (PVM) was introduced as one solution for Paral...
An important class of problems used widely in both the embedded systems and scientific domains perfo...
This paper presents an efficient compilation technique to generate the local memory acce...
Data-parallel languages, such as High Performance Fortran, are designed to make programming of distr...
An important research topic in parallelizing compilers is how to generate local memory access sequences ...
Arrays are mapped to processors through a two-step process---alignment followed by distribution---in...
Keywords: scratch pad memory, affine reference. This paper considers compiler management of fast, local memorie...
We present compiler optimization techniques for explicitly parallel programs that communicate thro...
This paper presents compilation techniques used to compress holes, which are caused by the nonunit a...
We present new techniques for compilation of arbitrarily nested loops with affine dependences for di...
This paper presents compilation techniques used to compress holes, which are caused by t...
Development of scalable application codes requires an understanding and exploitation of the locality...
An increasing number of programming languages, such as Fortran 90, HPF, and APL, provide...
This paper presents compilation techniques to compress holes, which are caused by the non-unit align...