We articulate the need for managing (data) locality automatically rather than leaving it to the programmer, especially in parallel programming systems. To this end, we propose techniques for tightly coupling the computation (including the thread scheduler) and the memory manager so that data and computation can be positioned closely in hardware. Such tight coupling of computation and memory management is in sharp contrast with the prevailing practice of considering each in isolation. For example, memory-management techniques usually abstract the computation as an unknown "mutator", treated as a "black box". As an example of the approach, in this paper we consider a specific class of parallel computations, nested-parallel computat...
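The abstract above describes coupling the task scheduler with the memory manager so that each nested-parallel task owns a local heap that lives and dies with it. The sketch below illustrates that idea in a simplified model; the names (`TaskHeap`, `alloc`, `promote`) and the fork-join structure are hypothetical illustrations, not the paper's actual API.

```python
# Minimal sketch: in a nested-parallel (fork-join) computation, each task
# allocates into its own local heap, keeping data close to the worker that
# computes it. At the join point the child heap is dropped wholesale;
# only explicitly promoted objects survive into the parent heap.
from concurrent.futures import ThreadPoolExecutor

class TaskHeap:
    """A per-task arena (hypothetical): objects allocated here die with
    the task unless promoted to the parent heap at join time."""
    def __init__(self, parent=None):
        self.parent = parent
        self.objects = []

    def alloc(self, obj):
        self.objects.append(obj)
        return obj

    def promote(self, obj):
        # Survivors move up one level in the heap hierarchy at join time.
        if self.parent is not None:
            self.parent.objects.append(obj)
        return obj

def fib(n, heap):
    # Nested fork-join: each recursive task gets its own child heap,
    # mirroring the task tree with a heap hierarchy.
    if n < 2:
        return n
    left_heap, right_heap = TaskHeap(heap), TaskHeap(heap)
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(fib, n - 1, left_heap)
        right = pool.submit(fib, n - 2, right_heap)
        result = heap.alloc(left.result() + right.result())
    # left_heap and right_heap go out of scope here: everything the
    # children allocated but did not promote is reclaimed in bulk,
    # with no tracing collection needed.
    return result

root = TaskHeap()
print(fib(8, root))  # 21
```

The design point this illustrates is that the scheduler's join operation doubles as a collection point: because the heap hierarchy mirrors the task hierarchy, reclamation is local to the worker and requires no global synchronization.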
Abstract: We present an efficient memory management scheme for concurrent programming languages where ...
Abstract: A uniform general purpose garbage collector may not always provide optimal performance. Some...
Gardens is a system which supports parallel computation across networks of workstations. This is a...
It is often assumed that computational load balance cannot be achieved in parallel and distributed s...
An important feature of functional programs is that they are parallel by defau...
Abstract: The development of efficient parallel out-of-core applications is often tedious, because ...
This work explores the tradeoffs of the memory system of a new massively parallel multiprocessor in ...
In this paper, we develop an automatic compile-time computation and data decomposition technique for...
On recent high-performance multiprocessors, there is a potential conflict between the goals of achie...
This paper describes a technique for improving the data reference locality of parallel programs usi...
The task parallel programming model allows programmers to express concurrency at a high level of abs...
This paper presents a scheme to manage heap data in the local memory present in each core of a limit...
Development of scalable application codes requires an understanding and exploitation of the locality...
Despite decades of work in this area, the construction of effective loop nest optimizers and paralle...