Nested data-parallelism (NDP) is a declarative style for programming irregular parallel applications. NDP languages provide language features favoring the NDP style, efficient compilation of NDP programs, and various common NDP operations like parallel maps, filters, and sum-like reductions. In this paper, we describe the implementation of NDP in Parallel ML (PML), part of the Manticore project. Managing the parallel decomposition of work is one of the main challenges of implementing NDP. If the decomposition creates too many small chunks of work, performance will be eroded by too much parallel overhead. If, on the other hand, there are too few large chunks of work, there will be too much sequential processing and processors will sit i...
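The chunk-size trade-off described in the abstract above can be illustrated with a minimal sketch. It is not the PML or Manticore implementation: the fixed grain size, the helper names `chunk_ranges` and `chunked_sum`, and the use of a Python thread pool are all assumptions made for illustration only. The sketch shows how the grain size determines the number of tasks created for a sum-like reduction; it makes no claim about speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n, grain):
    """Split the index range [0, n) into consecutive chunks of at most `grain` items.

    `grain` is a hypothetical fixed chunk size chosen by the caller.
    """
    if grain < 1:
        raise ValueError("grain must be at least 1")
    return [(lo, min(lo + grain, n)) for lo in range(0, n, grain)]


def chunked_sum(xs, grain, workers=4):
    """Reduce each chunk as a separate task, then combine the partial sums.

    Returns the total and the number of tasks that were created.
    """
    ranges = chunk_ranges(len(xs), grain)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda r: sum(xs[r[0]:r[1]]), ranges))
    return sum(partials), len(ranges)


data = list(range(1000))

# Very fine decomposition: one task per element, so scheduling overhead dominates.
total_fine, tasks_fine = chunked_sum(data, grain=1)

# Very coarse decomposition: two tasks, so at most two workers have anything to do.
total_coarse, tasks_coarse = chunked_sum(data, grain=500)
```

Both calls compute the same total; only the number of tasks differs (1000 versus 2 here). Choosing a grain between those extremes is the decomposition problem the abstract refers to.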
A set of communication operations is defined, which allows a form of task parallelism to be achieved...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
This paper presents a new technique to parallelize nested loops at the statement level. It transform...
Many parallel algorithms are naturally expressed at a fine level of granularity, often finer than a ...
Data parallelism is one of the more successful efforts to introduce explicit parallelism to high le...
A systematic procedure for designing pipelined data-parallel algorithms that are suitabl...
We present performance evaluations of parallel-for loop with work stealing technique. The paralle...
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language...
This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language...
We present a new parallel implementation of lazy ML. Our scheme is a direct extension of the G-machi...
We describe in this paper a new approach to parallelize branch-and-bound on a certain number of proc...
The tree-layout problem is to compute the coordinates of nodes of a tree so that the tree, when draw...
On shared memory parallel computers (SMPCs) it is natural to focus on decomposing the computation (...
Different parallelization methods for irregular reductions on shared memory multiprocessors have bee...
This paper presents a new technique to parallelize non-vectorizable loosely nested loops. Loosely ne...