In this paper, we present compiler algorithms for detecting references to stale data in shared-memory multiprocessors. The approach combines two key analysis techniques: stale reference detection and locality-preserving analysis. Stale reference detection finds the memory reference patterns that may violate cache coherence, while locality-preserving analysis minimizes the number of such stale references by analyzing both temporal and spatial reuse. By computing the regions referenced by arrays inside loops, we extend the previous scalar algorithms [8] to obtain a more precise analysis. We develop a full interprocedural array data-flow algorithm, which performs both bottom-up side-effect analysis and top-down context analysis on the proc...
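To make the idea of stale reference detection concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes a simplified model in which the program is a sequence of epochs (for example, parallel loops separated by synchronization), each array region is a half-open index range expanded into a set of elements, and a read is flagged when it touches an element that may have been cached before a write in an earlier epoch. The names `find_stale_reads`, `cached`, and `stale` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def find_stale_reads(epochs):
    """Flag reads that may observe stale cached data (illustrative model only).

    epochs: list of epochs; each epoch is a list of accesses
            (kind, array, lo, hi), where kind is 'R' or 'W' and
            [lo, hi) is the half-open index range that is touched.
    Returns a list of (epoch_index, access) for potentially stale reads.
    """
    cached = defaultdict(set)  # elements some processor may hold in its cache
    stale = defaultdict(set)   # cached elements overwritten in a later epoch
    flagged = []

    for k, epoch in enumerate(epochs):
        written = defaultdict(set)
        touched = defaultdict(set)

        for access in epoch:
            kind, array, lo, hi = access
            region = set(range(lo, hi))
            if kind == 'R' and region & stale[array]:
                # Assumption: a flagged read fetches fresh data, so the
                # region it covers is no longer considered stale afterwards.
                flagged.append((k, access))
                stale[array] -= region
            if kind == 'W':
                written[array] |= region
            touched[array] |= region

        # Epoch boundary: copies cached before this epoch and written in it
        # may now be stale in another processor's cache.
        for array in touched:
            stale[array] |= written[array] & cached[array]
            cached[array] |= touched[array]

    return flagged


program = [
    [('R', 'A', 0, 8)],                    # epoch 0: A[0:8] is read and cached
    [('W', 'A', 0, 4)],                    # epoch 1: A[0:4] is overwritten
    [('R', 'A', 2, 6), ('R', 'A', 4, 8)],  # epoch 2: only the first read overlaps A[0:4]
]
result = find_stale_reads(program)
```

In this example only the read of `A[2:6]` in epoch 2 is flagged, because it overlaps the region written in epoch 1 after having been cached in epoch 0. The read of `A[4:8]` is left alone, which is the kind of precision that region-based array analysis is meant to provide over treating the whole array as a single scalar-like object.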
Cache memories were incorporated into microprocessors early on and represent the most common...
In large-scale machines, thousands of processor cycles, in other words, missed opportunities to issu...
Exploiting locality of reference is key to realizing high levels of performance on modern p...
Both hardware-controlled and compiler-directed mechanisms have been proposed for maintaining cache c...
This dissertation presents a systematic approach to reduction of cache coherence overhead in shared-...
This document describes a set of new techniques for improving the efficiency of compiler-directed so...
Reducing memory latency is critical to the performance of large-scale parallel systems. Due to the t...
As multicore processors implementing shared-memory programming models have become commonplace, analy...
Although it is convenient to program large-scale multiprocessors as though all processors shared acc...
We have developed compiler algorithms that analyze coarse-grained, explicitly parallel programs and ...
Most memory references in numerical codes correspond to array references whose indices are affine fu...
The cache coherence maintenance problem has been the major obstacle in using private cache memory to...
Grantor: University of Toronto. This dissertation proposes and evaluates compiler techniques...
Cache coherence is one of the main challenges to tackle when designing a shared-memory multiprocesso...
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory archit...