Abstract. We prove an analogue of Brent’s lemma for BSP-like parallel machines featuring a hierarchical structure for both the interconnection and the memory. Specifically, for these machines we present a uniform scheme to simulate any computation designed for v processors on a v′-processor configuration with v′ ≤ v and the same overall memory size. For a wide class of computations, the simulation exhibits optimal O(v/v′) slowdown. The simulation strategy aims at translating communication locality into temporal locality. As an important special case (v′ = 1), our simulation can be employed to obtain efficient hierarchy-conscious sequential algorithms from efficient fine-grained ones.
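To make the O(v/v′) slowdown concrete, the following is a minimal sketch of a naive Brent-style simulation of one superstep, not the hierarchical scheme of the paper. It assumes purely local computation (no communication and no memory hierarchy), and the names `simulate_superstep`, `local_step`, and `v_prime` are hypothetical. Each of the v′ hosts takes a contiguous block of ⌈v/v′⌉ virtual processors and runs them one after another, so a superstep costs ⌈v/v′⌉ host rounds.

```python
from math import ceil


def simulate_superstep(states, local_step, v_prime):
    """Run one superstep of v = len(states) virtual processors on v_prime hosts.

    Illustrative assumption: each virtual processor only updates its own
    state, so communication and memory-hierarchy costs are ignored.
    Returns (new_states, host_rounds).
    """
    v = len(states)
    if not 1 <= v_prime <= v:
        raise ValueError("need 1 <= v_prime <= v")

    block = ceil(v / v_prime)  # virtual processors assigned to each host
    new_states = list(states)

    # Rounds run sequentially; within a round the hosts act in parallel
    # (modelled here by the inner loop).
    for r in range(block):
        for host in range(v_prime):
            i = host * block + r  # r-th virtual processor of this host
            if i < v:
                new_states[i] = local_step(i, states[i])

    return new_states, block


if __name__ == "__main__":
    states, rounds = simulate_superstep(list(range(8)), lambda i, s: s + 1, 2)
    print(states, rounds)
```

Assigning contiguous blocks keeps neighbouring virtual processors on the same host, which only hints at how communication locality might become temporal locality. The actual simulation in the paper handles communication and the memory hierarchy, which this sketch leaves out.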
In a recent paper (SPAA'01), we have established that the Pipelined Hierarchical Random Access Machi...
Abstract. We present work-preserving emulations with small slowdown between LogP and two other parall...
We present a general deterministic scheme to implement a shared memory abstraction on any distribute...
In this work, we show that the submachine locality exposed by hierarchical bulk-synchronous computati...
The design of algorithms exhibiting a high degree of temporal and spatial locality of reference is c...
This chapter describes the Decomposable Bulk Synchronous Parallel (D-BSP) model of computation, as ...
We introduce a model of parallel computation that retains the ideal properties of the PRAM by using ...
The memories of real life computers usually have a hierarchical structure with levels like registers...
Abstract. The power of shared-memory in models of parallel computation is studied, and a novel distr...
We must consider communication. Algorithms have two kinds of costs: computation and communication mov...
Processors have become faster at a much quicker rate than memory access time, creating a wide gap betw...
This paper explores the relation between the structured parallelism exposed by the Decomposable BSP ...
Abstract. We study two classes of unbounded fan-in parallel computation, the standard one, based on un...
This thesis outlines a cost-effective multiprocessor architecture that takes into consideration the ...