To date, data locality optimizing algorithms have mostly aimed at providing efficient strategies for blocking and reordering loops, but little research has been devoted to the final step, i.e., computing the optimal block size. Optimal block sizes are currently computed as if the cache behaved as a local memory, i.e., cache interference phenomena are ignored. Case studies have already shown that cache interference can greatly affect the optimal block size. The purpose of this paper is to propose a methodology for estimating interference misses in a regular do-loop nest and to use that knowledge to derive the optimal block size. First, the different types of interference phenomena are identified, and a method for predicting their occurrence and evalua...
This paper describes an algorithm to optimize cache locality in scientific codes on uniprocessor and m...
Applications often under-utilize cache space and there are no software locality optimization techniq...
Most memory references in numerical codes correspond to array references whose indices are affine fu...
Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchie...
Cache behavior is complex and inherently unstable, yet it is a critical factor affecting program per...
We develop from first principles an exact model of the behavior of loop nests executing in a memory ...
This paper proposes an optimization by an alternative approach to memory mapping. Caches with low se...
The performance of cache memories relies on the locality exhibited by programs. Traditionally this l...
This paper presents hyperblocking, or hypertiling, a novel optimization technique that makes it poss...
Applications with regular patterns of memory access can experience high levels of cache conflict mis...
Cache interference is found to play a critical role in optimizing cache allocation among concurr...
Improving cache performance requires understanding cache behavior. However, measuring cache performa...
Cache memories were inv...
Caches were designed to amortize the cost of memory accesses by moving copies of frequently accessed...
Caches are an important part of architectural and compiler high performance and low-power strategies...