Traditionally, in distributed-memory architectures, locality maintenance and load balancing are seen as user-level activities involving compiler and runtime system support in software. Such software solutions require an explicit phase of execution during which the application must suspend its activities. This paper presents the first (to our knowledge) architecture-level scheme for extracting locality concurrently with application execution. An artificial neural network coprocessor dynamically monitors processor reference streams to learn the temporally emergent utilities of data elements in ongoing local computations. This facilitates the use of kernel-level load balancing schemes, thus easing the user's programming burden. The kern...
We articulate the need for managing (data) locality automatically rather than leaving it to the prog...
The exploitation of locality of reference in shared memory multiprocessors is one of the most import...
This paper describes a software architecture designed as a support for tackling the load distributio...
It is often assumed that computational load balance cannot be achieved in parallel and distributed s...
We define a set of overhead functions that capture the salient artifacts representing the interactio...
This paper presents a simple load balancing algorithm and its probabilistic analysis. Unlike most of...
We report on the improvements that can be achieved by applying machine learning techniques, in part...
A parallel concurrent application runs most efficiently and quickly when the workload is distributed...
The compilation of high-level programming languages for parallel machines faces two challenges: maxi...
Many data-intensive applications exhibit poor temporal and spatial locality and perform poorly on co...
The Flagship Parallel Reduction Machine is designed to execute declarative language programs based o...
Load Balancing in Parallel Computers: Theory and Practice is about the essential software technique ...
Data locality optimization is a well-known goal when handling programs that mu...
In this paper, we present a cohesive, practical load balancing framework that addresses many short...
Modern computer architectures expose an increasing number of parallel features supported by complex ...