Communication overhead in multiprocessor systems, as exemplified by cache coherency traffic and global memory access, has a substantial impact on multiprocessor performance. This thesis develops compile-time techniques to reduce the overhead of interprocessor communication for iterative data-parallel loops. These techniques exploit machine-specific information to minimize communication overhead, thus eliminating the need for a user to tune a program for each new multiprocessor. Such techniques are a necessary step toward developing software to support portable parallel programs. Adaptive Data Partitioning (ADP) reduces the execution time of parallel programs by minimizing interprocessor communication for iterative data-parallel loops with n...
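The partitioning idea behind ADP can be illustrated with a minimal sketch (the function name and stencil example below are illustrative assumptions, not taken from the thesis): block-partitioning the iteration space of a data-parallel loop gives each processor a contiguous slice, so most accesses stay local and communication is confined to partition boundaries.

```python
# Hypothetical sketch of block partitioning for an iterative
# data-parallel loop; names are illustrative, not from the thesis.

def block_partition(n_iters, n_procs):
    """Split range(n_iters) into n_procs contiguous blocks."""
    base, extra = divmod(n_iters, n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        # The first `extra` processors take one additional iteration.
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks

# For a 1-D stencil a[i] = f(a[i-1], a[i], a[i+1]), each processor only
# needs the boundary elements of its neighbours' blocks, so per-iteration
# communication volume is O(n_procs) rather than O(n_iters).
parts = block_partition(10, 3)
print([list(r) for r in parts])  # → [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

A compile-time system in the spirit of ADP would choose between such distributions (e.g. block vs. cyclic) using machine-specific communication costs, rather than requiring the user to tune the layout per machine.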
Loops are the main source of parallelism in scientific programs. Hence, several techniques were dev...
Distributed-memory multicomputers, such as the Intel iPSC/860, the Intel Paragon, the IBM SP-1/SP-2...
Intensive scientific algorithms can usually be formulated as nested loops which are the ...
Scalable shared-memory multiprocessor systems are typically NUMA (nonuniform memory access) machines...
In this paper we present a solution to the problem of determining loop and data partitions automat...
Data-parallel languages allow programmers to use the familiar machine-independent programming style ...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/19...
In distributed memory multicomputers, local memory accesses are much faster than those i...
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer S...
This paper addresses the problem of partitioning data for distributed memory machines (multicomputer...
Shared-memory multiprocessor systems can achieve high performance levels when appropriate work paral...
220 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1986. This dissertation discusses s...
Grantor: University of Toronto. This dissertation proposes and evaluates compiler techniques...
In order to reduce remote memory accesses on CC-NUMA multiprocessors, we present an interprocedural ...
170 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1986. Since the mid-1970s, vector ...