This paper outlines two methods that we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. First, we describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed among processors, which insulates them from having to deal explicitly with the potentially complex algorithms that carry out work and data partitioning. Second, we describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
Automatic Global Data Partitioning for Distributed Memory Machines (DMMs) is a difficult problem. Di...
Distributed-memory multicomputers, such as the Intel iPSC/860, the Intel Paragon, the IBM SP-1/SP-2...
Outlined here are two methods which we believe will play an important role in any distributed memory...
In scalable multiprocessor systems, high performance demands that computational load be balanced eve...
We developed a dataflow framework which provides a basis for rigorously defining strategies to make ...
A compiler and runtime support mechanism is described and demonstrated. The methods presented are ca...
In recent years, distributed memory parallel machines have been widely recognized as the most likely...
In this paper, we develop an automatic compile-time computation and data decomposition technique for...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16...
This paper describes two new ideas by which an HPF compiler can deal with irregular computations eff...
In this paper, we describe two new ideas by which an HPF compiler can deal with irregular computations ...
Data-parallel languages, such as High Performance Fortran or Fortran D, provide a machin...
Communication overhead in multiprocessor systems, as exemplified by cache coherency traffic and glob...