This paper presents an approach to statement-level independent partitioning of uniform recurrences, i.e., loops with constant dependence distance vectors. Uniform recurrences may be partitioned into independent subsets of the set of all statement instances that require no communication or synchronization on a multiprocessor. Therefore, independent partitioning is highly efficient and desirable. This paper presents a method to partition uniform recurrences using statement-level affine schedules, and an algorithm for code generation. We consider a statement instance as a basic unit that can be allocated to a processor, in contrast to existing methods that use an iteration instance. Using this approach, we not only find maximal independe...
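The difference between statement-level and iteration-level partitioning can be illustrated with a small sketch. It is not the affine-schedule method of the paper: it simply counts connected components of a dependence graph with a union-find structure. The two-statement loop (S1: a[i] = b[i-1]; S2: b[i] = a[i-1]), the trip count N, and the helper names find and count_partitions are all assumptions made up for this illustration.

```python
# Hypothetical uniform recurrence (not taken from the paper):
#   for i in 1..N-1:
#       S1: a[i] = b[i-1]   # S1(i) depends on S2(i-1)
#       S2: b[i] = a[i-1]   # S2(i) depends on S1(i-1)
# Both dependences have the constant distance 1.

def find(parent, x):
    """Return the representative of x, halving paths along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_partitions(nodes, edges):
    """Count connected components, i.e. sets with no dependence between them."""
    parent = {n: n for n in nodes}
    for u, v in edges:
        parent[find(parent, u)] = find(parent, v)
    return len({find(parent, n) for n in nodes})

N = 8  # assumed trip count, chosen only for the example

# Statement level: one node per statement instance.
stmt_nodes = [(s, i) for s in ("S1", "S2") for i in range(N)]
stmt_edges = [(("S2", i - 1), ("S1", i)) for i in range(1, N)]
stmt_edges += [(("S1", i - 1), ("S2", i)) for i in range(1, N)]

# Iteration level: one node per iteration; iteration i depends on i-1.
iter_nodes = list(range(N))
iter_edges = [(i - 1, i) for i in range(1, N)]

stmt_parts = count_partitions(stmt_nodes, stmt_edges)
iter_parts = count_partitions(iter_nodes, iter_edges)
print(stmt_parts, iter_parts)
```

In this toy loop the iteration-level graph is a single chain, while the statement-level graph splits into two chains that never exchange data, so treating statement instances as the unit of allocation exposes parallelism that the iteration-level view hides.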
In this paper we present substantially improved thread partitioning algorithms for modern implicitly...
In this paper we address the problem of partitioning nested loops with non-uniform (irregular) depen...
This paper presents a new technique to parallelize nested loops at the statement level. It transform...
Non-uniform distance loop dependences are a known obstacle to finding parallel iterations. To find the ...
Three related problems, among others, are faced when trying to execute an algorithm on a parallel ma...
The paper is concerned with the uniformization of a system of affine recurrence equations. This tran...
This paper addresses the problems of communication-free partitions of statement-iterations of neste...
This short report is a companion paper of the research report [1]. It cannot be read w...
A methodology for partitioning and mapping of arbitrary uniform recurrence equations (UREs) expresse...
In this paper, an approach to the problem of exploiting parallelism within nested loops is ...
This paper presents an approach to software pipelining of nested loops. While several papers have ad...
Executing a program in parallel machines needs not only to find sufficient parallelism in a program,...