In this work, we show that the submachine locality exposed by hierarchical bulk-synchronous computations can be efficiently turned into locality of reference on arbitrarily deep hierarchies. Specifically, we develop efficient schemes to simulate parallel programs written for the Decomposable BSP (a BSP variant which features a hierarchical decomposition into submachines) on the sequential Hierarchical Memory Model (HMM), which rewards the exploitation of temporal locality, and on its extension with block transfer, the BT model, which also rewards the exploitation of spatial locality. The simulations yield good hierarchy-conscious sequential algorithms from parallel ones, and provide evidence of the strict relation between submachine localit...
This paper investigates the design of parallel algorithmic strategies that address the efficient use...
In order to mitigate the impact of the constantly widening gap between processor speed and main memo...
We introduce a physical analogy to describe problems and high-performance concurrent computers on wh...
The design of algorithms exhibiting a high degree of temporal and spatial locality of reference is c...
We prove an analogue of Brent's lemma for BSP-like parallel machines featuring a hierarchical struct...
This chapter describes the Decomposable Bulk Synchronous Parallel (D-BSP) model of computation, as ...
Processor speeds have improved much faster than memory access times, creating a wide gap betw...
The memories of real-life computers usually have a hierarchical structure with levels like registers...
This paper explores the relation between the structured parallelism exposed by the Decomposable BSP ...
This paper formulates and investigates the question of whether a given algorithm can be coded in a w...
We introduce a model of parallel computation that retains the ideal properties of the PRAM by using ...
The evolution of computing technology towards the ultimate physical limits makes communication the d...