In this article, we introduce SPLAT (Static and Profiled Data Locality Analysis Tool). The tool provides a fast study of memory behavior without requiring a costly memory simulator. SPLAT combines a static locality analysis with simple profiling data. Its overhead is low because it performs most of the analysis at compile time and because the only profiling support it requires is a basic-block execution count, an option that many commercial compilers support. Compared with simulation techniques, SPLAT's estimation technique is highly accurate for numeric codes.

Peer Reviewed
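As a rough illustration of how a static locality estimate and a basic-block execution count might be combined, the sketch below weights a per-reference miss ratio by the profiled count of the enclosing block. It is not SPLAT's actual algorithm: the `MemRef` type, the `estimate_misses` function, the block labels, and every numeric value are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemRef:
    """One static memory reference (hypothetical representation)."""
    name: str          # label for the reference, e.g. "A[i][j]"
    block: str         # basic block that contains the reference
    miss_ratio: float  # assumed compile-time miss-ratio estimate in [0, 1]


def estimate_misses(refs, block_counts):
    """Weight each static miss ratio by the profiled block execution count.

    Returns (total accesses, estimated misses, overall miss ratio).
    """
    accesses = 0
    misses = 0.0
    for ref in refs:
        # A block missing from the profile is treated as never executed.
        count = block_counts.get(ref.block, 0)
        accesses += count
        misses += count * ref.miss_ratio
    ratio = misses / accesses if accesses else 0.0
    return accesses, misses, ratio


# Illustrative inputs only: an outer-loop block B1 and an inner-loop block B2.
refs = [
    MemRef("A[i][j]", "B2", 0.125),  # assumed unit stride, 8 elements per line
    MemRef("B[j][i]", "B2", 1.0),    # assumed to miss on every access
    MemRef("s", "B1", 0.0),          # assumed to stay resident in cache
]
block_counts = {"B1": 100, "B2": 10_000}

accesses, misses, ratio = estimate_misses(refs, block_counts)
print(f"accesses={accesses} estimated misses={misses:.0f} ratio={ratio:.3f}")
```

The point of the sketch is the division of labor the abstract describes: the miss ratios would come from compile-time analysis, and the only run-time input is the per-block count.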
The trend in computer architecture is that processor speeds are increasing rapidly compared to memor...
Improving cache performance requires understanding cache behavior. However, measuring cache performa...
In POPL 2002, Petrank and Rawitz showed a universal result---finding optimal data placement is not o...
This paper presents a tool based on a new approach for analyzing the locality exhibited by data memo...
Data locality is central to modern computer designs. The widening gap between processor speed and me...
Most memory references in numerical codes correspond to array references whose indices are affine fu...
As computing efficiency becomes constrained by hardware scaling limitations, code optimization grows...
The growing gap between processor clock speed and DRAM access time puts new demands on software and ...
There is an ever widening performance gap between processors and main memory, a gap bridged by small...
The growing processor/memory performance gap causes the performance of many codes to be limited by m...
Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak process...
Over the past decades, core speeds have been improving at a much higher rate than memory bandwidth. ...
The widening memory gap reduces performance of applications with poor data locality. Therefore, ther...
Cache is one of the most widely used components in today's computing systems. Its performance is hea...