Temporal locality optimizations for stencil operations for parallel object-oriented scientific frameworks on cache-based architectures

Bassetti, F.
Davis, K.
Quinlan, D.

Publication date

December 1998

Publisher

Los Alamos National Laboratory

Abstract

High-performance scientific computing relies increasingly on high-level, large-scale object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks (a parallel or serial array-class library), provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes a technique for introducing cache blocking suitable for certain classes of numerical algorithms, demonstrates and analyzes the resulting performance gai...