The exploitation of locality of reference in shared-memory multiprocessors is one of the most important problems in parallel processing today. Locality can be managed at several levels: the hardware, the operating system, the runtime environment of the compiler, and the user level. In this paper we investigate the problem of exploiting locality at the operating system level and its interactions with the compiler and the architecture. Our main conclusion, based on trace-driven simulations of real applications, is that exploitation of locality is effective only if all three levels cooperate.
In the context of sequential computers, it is common practice to exploit temporal locality of refer...
On recent high-performance multiprocessors, there is a potential conflict between the goals of achie...
It is often assumed that computational load balance cannot be achieved in parallel and distributed s...
Data locality is a well-recognized requirement for the development of any parallel application, but ...
We define a set of overhead functions that capture the salient artifacts representing the interactio...
We propose a synthetic address trace generation model which combines the accuracy advantage of trace-...
The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programm...
We present a unified approach to locality optimization that employs both data and control transforma...
Parallel graph reduction is a model for parallel program execution in which shared-memory...
The locality of reference in program behavior has been studied and modeled extensively because of it...
Improving program locality has become increasingly important on modern computer systems. An effectiv...
The performance of cache memories relies on the locality exhibited by programs. Traditionally this l...
Data locality is one of the most important characteristics of programs. Its study has significant in...
We have developed compiler algorithms that analyze coarse-grained, explicitly parallel programs and ...