Running scientific applications on parallel processing systems is one of the most common ways to improve computing performance. In applications such as fluid mechanics, structural analysis, and solid-state simulations, the dependencies across loop iterations (loop-carried dependencies) in the computation of array elements may be constant (uniform) or functions of the array indices (non-uniform). Traditional scheduling techniques can efficiently generate an optimized schedule for applications with uniform dependencies. However, non-uniform dependencies between array elements may cause the scheduler to produce an inefficient schedule. This paper presents a new scheduling methodology for both uniform and non-unifo...
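To make the distinction between uniform and non-uniform loop-carried dependencies concrete, the following is a minimal sketch and is not taken from the paper. The helper `dependence_distances` and the two example loops are hypothetical, chosen only for illustration. It records, for a one-dimensional loop of the form `A[write_index(i)] = f(A[read_index(i)])`, how many iterations separate each read from the earlier write that produced its value (flow dependencies only).

```python
def dependence_distances(write_index, read_index, n):
    """Collect flow-dependence distances for the hypothetical loop
        for i in range(n): A[write_index(i)] = f(A[read_index(i)])
    A distance is (reading iteration) - (iteration that last wrote the element).
    """
    last_writer = {}   # array element -> iteration that most recently wrote it
    distances = set()
    for i in range(n):
        # The read happens before the write within one iteration.
        src = last_writer.get(read_index(i))
        if src is not None:
            distances.add(i - src)
        last_writer[write_index(i)] = i
    return distances


# Uniform case (assumed example): A[i] = A[i - 2] + 1
uniform = dependence_distances(lambda i: i, lambda i: i - 2, 10)

# Non-uniform case (assumed example): A[2 * i] = A[i] + 1
non_uniform = dependence_distances(lambda i: 2 * i, lambda i: i, 10)

print(uniform)      # a single constant distance
print(non_uniform)  # distances that grow with the loop index
```

In the first loop every dependence has the same distance, so a scheduler can plan around one fixed vector. In the second loop the distance depends on the index, which is the situation the abstract describes as harder to schedule efficiently.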
This paper proposes an efficient run-time system to schedule general nested loops on multiprocessors...
The parallelization of complex, irregular scientific applications with various computational require...
Software pipelining is a loop scheduling technique that extracts parallelism from loops by overlappi...
Using parallel processing systems to execute scientific applications is one of the most common solut...
We consider the resource-constrained scheduling of loops with inter-iteration dependencies. A loop i...
Loop pipelining is a scheduling technique widely used to improve the performance of systems running ...
Three related problems, among others, are faced when trying to execute an algorithm on a parallel ma...
It is extremely difficult to parallelize DOACROSS loops with non-uniform loop-carried dependences. I...
We consider the problem of scheduling parallel loops that are characterized by highly varying execut...
One of the biggest problems in parallel processing is to obtain a good schedule without having a kno...
Fine-grain parallelism available in VLIW and superscalar processors can be mainly exploited in compu...
In this paper, we survey loop parallelization algorithms, analyzing the dependence representations t...
In this paper we present an efficient template for the implementation on distributed-memory multipro...
Scientific applications, such as N-body, Monte Carlo, and computational fluid dynamics consist of la...