This paper investigates the design of parallel algorithmic strategies that make efficient use of both the memory hierarchies within each processor and the multilevel clustered structure of the interconnection between processors. In the past, these two concerns have usually been addressed separately. This paper is a first step towards parallel algorithmic strategies that address both at the same time. As a case study, we investigate the distribution sweeping method, which has been very effective for the design of external memory algorithms for computational geometry problems. We present a novel method for parallel distribution sweeping on a clustered parallel machine with hierarchical local memories, showing that it yields optimal computation...
In this paper, we give new techniques for designing efficient algorithms for computational geometry pro...
The 2011 IEEE International Parallel & Distributed Processing Symposium (IPDPS), Anchorage, Alaska, ...
Processor arrays can be used as accelerators for plenty of data flow-dominant applications. The ex...
ESA 2013: 21st Annual European Symposium, Sophia Antipolis, France, 2-4 September 2013. In this paper, ...
The memories of real life computers usually have a hierarchical structure with levels like registers...
Processors have become faster at a much quicker rate than memory access time, creating a wide gap betw...
In this paper we introduce parallel versions of two hierarchical memory models and give optimal algo...
The task-to-processor mapping problem is addressed in the context of a local-memory multiprocessor w...
This dissertation presents optimization techniques for efficient data parallel formulation/implement...
This article focuses on principles for the design of efficient parallel algorithms for distributed m...
Irregular problems arise in many areas of computational physics and other scientific applications. A...
We study scalable parallel computational geometry algorithms for the coarse grained multicomputer mod...
A summary of the results achieved in the paper "Optimal Randomized Parallel Algorithms for Comp...
The design of algorithms exhibiting a high degree of temporal and spatial locality of reference is c...