Compilation of parallel loops is one of the most important parts of parallel compilation and optimization. This paper discusses the key techniques involved in implementing the compilation of parallel loops under uniform partition schemes, including local array index generation, loop space reconstruction, communication detection and organization, and data dependence handling. The efficiency of this implementation has been demonstrated by extensive experiments: the p_HPF compiler, which adopts this compiling framework, obtains good speedups and efficiencies. The compiler has been applied in many fields, particularly petroleum exploration.
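The abstract above names local array index generation under a uniform (block) partition as a key technique. As a hedged illustration only (not taken from the paper), the global-to-local index mapping that such a compiler emits for a block-distributed array can be sketched as follows; `block_bounds` and `global_to_local` are hypothetical helper names:

```python
def block_bounds(n, p, rank):
    """Return the half-open range [lo, hi) of global indices owned by
    process `rank` when n elements are block-distributed over p processes.
    The first (n mod p) processes each receive one extra element."""
    base, rem = divmod(n, p)
    lo = rank * base + min(rank, rem)
    hi = lo + base + (1 if rank < rem else 0)
    return lo, hi

def global_to_local(g, n, p, rank):
    """Convert global index g to the owner-local index on `rank`,
    or None if `rank` does not own g."""
    lo, hi = block_bounds(n, p, rank)
    return g - lo if lo <= g < hi else None
```

For example, distributing 10 elements over 3 processes gives ownership ranges [0, 4), [4, 7), and [7, 10), so global index 5 maps to local index 1 on process 1.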
The main function of parallelizing compilers is to analyze sequential programs, in parti...
Loops in scientific and engineering applications provide a rich source of parallelism. In order to o...
Communication overhead in multiprocessor systems, as exemplified by cache coherency traffic and glob...
Computation partition is one of the most important problems in parallel compilation and optimization...
In this paper, an approach to the problem of exploiting parallelism within nested loops is ...
In the data parallel programming style the user usually specifies the data parallelism explicitly so...
Over the past two decades tremendous progress has been made in both the design of parallel architect...
Loops are the main source of parallelism in scientific programs. Hence, several techniques were dev...
Loop fusion is a program transformation that merges multiple loops into one. It is effectiv...
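The loop fusion transformation described in this abstract can be illustrated with a minimal sketch (an assumed example, not drawn from the cited paper): two loops over the same index range are merged into one, so the intermediate array is reduced to a scalar temporary and each value is consumed immediately after it is produced.

```python
def unfused(a):
    # Loop 1 produces the whole intermediate array b ...
    b = [x + 1 for x in a]
    # ... and loop 2 traverses b again, after it may have left the cache.
    c = [x * 2 for x in b]
    return c

def fused(a):
    # Fused loop: one traversal; the intermediate b[i] becomes a
    # scalar temporary consumed as soon as it is computed.
    return [(x + 1) * 2 for x in a]
```

Both versions compute the same result; fusion pays off when the intermediate array is large relative to the cache.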
This paper first discusses three types of parallel computing models in cluster environment, namely G...
In this paper, we have presented the design and evaluation of a compiler system, called APE, for ...
In this paper, we present original techniques for the generation and the effic...
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques...
This paper presents a compilation technique that performs automatic parallelization of can...