Abstract—Parallelization and locality optimization of affine loop nests have been successfully addressed for shared-memory machines. However, many large-scale simulation applications must be executed in a distributed-memory environment, and use irregular/sparse computations where the control flow and array-access patterns are data-dependent. In this paper, we propose an approach for effective parallel execution of a class of irregular loop computations in a distributed-memory environment, using a combination of static and run-time analysis. We discuss algorithms that analyze sequential code to generate an inspector and an executor. The inspector captures the data-dependent behavior of the computation in parallel and without requiring comple...
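The inspector/executor split described above can be illustrated with a minimal sketch (the function names, the wavefront strategy, and the pure-assignment loop body are illustrative assumptions, not the paper's actual algorithm): the inspector scans the index array once and assigns each iteration to a conflict-free "wave"; the executor then runs the waves in order, and iterations inside a wave are free to run in parallel because they write distinct locations.

```python
def inspect(idx, n_iters):
    # Inspector: assign each iteration to a wavefront so that no two
    # iterations in the same wave write the same location a[idx[i]].
    # Iterations that write the same location carry output dependences
    # and are placed in successive waves, preserving write order.
    last_wave = {}                 # location -> latest wave writing it
    wave_of = [0] * n_iters
    for i in range(n_iters):
        w = last_wave.get(idx[i], -1) + 1
        wave_of[i] = w
        last_wave[idx[i]] = w
    return wave_of

def execute(a, idx, vals, wave_of):
    # Executor: waves run in order; iterations within one wave are
    # independent and could be distributed across processes.
    for w in range(max(wave_of) + 1):
        for i, wi in enumerate(wave_of):
            if wi == w:
                a[idx[i]] = vals[i]
```

For example, with `idx = [1, 2, 1, 3]` the inspector puts iterations 0, 1, 3 in wave 0 and iteration 2 (which rewrites `a[1]`) in wave 1, so the final contents match the sequential loop.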
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming ...
Sparse system solvers and general-purpose codes for solving partial differential equations are examp...
Enhancing high-performance computing on distributed computers calls for a programming environment to ...
This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Compute...
Abstract. A loop with irregular assignment computations contains loopcarried output data dependences...
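For a loop of pure irregular assignments `a[idx[i]] = vals[i]`, the loop-carried output dependences mentioned above can also be eliminated by observing that only the last write to each location is live. The sketch below (names and structure are illustrative, not the paper's technique, and it is valid only when the right-hand side has no other side effects) filters the iterations down to one per target location, after which the surviving writes can run fully in parallel:

```python
def last_write_only(a, idx, vals):
    # For a pure irregular assignment loop
    #     for i: a[idx[i]] = vals[i]
    # only the last iteration writing each location matters. Keeping
    # just those iterations removes all output dependences: the
    # surviving writes target distinct elements and may execute in
    # any order.
    last = {}
    for i, loc in enumerate(idx):      # one sequential scan
        last[loc] = i
    for loc, i in last.items():        # conceptually parallel
        a[loc] = vals[i]
```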
Loops are the main source of parallelism in scientific programs. Hence, several techniques were dev...
A run-time technique based on the inspector-executor scheme is proposed in this paper to...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...
The goal of parallelizing, or restructuring, compilers is to detect and exploit parallelism in seque...
Abstract. Speculative parallelization is a classic strategy for automatically parallelizing codes that...
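The idea behind speculative parallelization can be sketched in a few lines. The sequential simulation below is only a conceptual model (all names are assumptions; real implementations use threads, versioned memory, and per-thread commit logic): every iteration runs optimistically against a snapshot of shared state while logging its reads and pending writes; a validation pass then detects whether any iteration consumed a stale value, and on misspeculation the loop is re-executed sequentially.

```python
def run_speculative(body, n, state):
    # Each iteration runs against a snapshot of `state` (conceptually
    # in parallel), logging the locations it read and its pending writes.
    snap = dict(state)
    logs = []
    for i in range(n):
        reads, writes = set(), {}
        read = lambda k: (reads.add(k), snap[k])[1]
        write = lambda k, v: writes.__setitem__(k, v)
        body(i, read, write)
        logs.append((reads, writes))
    # Validate in iteration order: if iteration j read a location that
    # an earlier iteration wrote, its input was stale (misspeculation).
    written, ok = set(), True
    for reads, writes in logs:
        if reads & written:
            ok = False
            break
        written |= writes.keys()
    if not ok:
        # Recovery: discard speculative work, re-run sequentially.
        for i in range(n):
            body(i, state.__getitem__, state.__setitem__)
    else:
        for _, writes in logs:         # commit pending writes in order
            state.update(writes)
    return state
```

A flow dependence (iteration 1 reading a value iteration 0 writes) triggers the sequential fallback, while independent iterations commit their speculative writes directly.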
When the inter-iteration dependency pattern of the iterations of a loop cannot be determined statica...
Reordering of data is becoming increasingly important for achieving higher performance in...
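One common data-reordering strategy for indirect accesses is first-touch renumbering, sketched below (the function name and heuristic are illustrative assumptions, not a specific paper's method): array elements are relocated in the order the index array first references them, and the index array is rewritten to match, so that consecutive iterations of `a[idx[i]]` touch nearby memory.

```python
def first_touch_reorder(a, idx):
    # Build a permutation: element `loc` of `a` moves to the position
    # given by the order in which idx first touches it; elements never
    # touched are appended at the end.
    perm = {}
    for loc in idx:
        if loc not in perm:
            perm[loc] = len(perm)
    for loc in range(len(a)):
        if loc not in perm:
            perm[loc] = len(perm)
    new_a = [None] * len(a)
    for old, new in perm.items():      # relocate the data
        new_a[new] = a[old]
    new_idx = [perm[loc] for loc in idx]   # rewrite the index array
    return new_a, new_idx
```

After reordering, `new_a[new_idx[i]]` yields exactly the values `a[idx[i]]` did, but the accesses proceed in increasing address order, which tends to improve cache behaviour.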