Improving the locality of memory accesses is key to exploiting current and future multi-core platforms efficiently. Irregular applications, which operate on pointer-based data structures, are hard to optimize on modern computer architectures because of their intrinsically unpredictable memory access patterns. In this paper we explore a set of memory locality-driven data structures to attenuate the memory bandwidth limitations of typical irregular algorithms. We identify the inefficiencies in the standard Java implementation of a priority queue as one of the main memory limitations in Prim’s Minimum Spanning Tree algorithm. We also present a priority queue using a data layout inspired by van Emde Boas for ordering hea...
As increases in processor speed continue to outpace increases in cache and memory speed, progra...
An important class of scientific codes accesses memory in an irregular manner. Because irregular acce...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 1997. Simultaneously published...
One key issue in designing parallel applications that scale on multicore systems is how to overcome the...
The growing gap between processor and memory speeds is motivating the need for optimization strategi...
Given the large communication overheads characteristic of modern parallel machines, optimizations th...
This paper presents the Gaspar data-centric framework to develop high performance parallel applicati...
Over the past decades, core speeds have been improving at a much higher rate than memory bandwidth. ...
The delivered performance on modern processors that employ deep memory hierarchies is closely relate...
This work explores the tradeoffs of the memory system of a new massively parallel multiprocessor in ...
The evolution of computing technology towards the ultimate physical limits makes communication the d...
Recently, multi-core chips have become omnipresent in computer systems ranging from high-end server...
The recent past has seen single processing systems becoming obsolete and multiprocessor systems taki...
We articulate the need for managing (data) locality automatically rather than leaving it to the prog...
This paper describes a technique for improving the data reference locality of parallel programs usi...