We describe an important memory optimization that arises in the presence of aggregate data structures, such as arrays and structs, in a C/C++-based system design methodology. We present an algorithm for determining an optimized memory layout of such data. Our implementation consists of a pointer analysis and resolution phase, followed by memory layout optimization. Experiments on typical applications from the DSP domain show up to a 44% improvement in memory performance.
Programs accessing disk-resident arrays, called out-of-core programs, perform poorly in ge...
Despite the potential importance of data structure layouts and traversal patterns, compiler transfor...
Modern multicore embedded systems often execute applications that rely heavily on concurrent data st...
This paper introduces a dynamic layout optimization strategy to minimize the number of cycles spent ...
The system efficiency and throughput of most architectures are critically dependent on the ability o...
Programmers of high-performance applications face many challenging aspects of contemporary hardware ...
One key issue to design parallel applications that scale on multicore systems is how to overcome the...
We present a technique to increase data cache utilization of pointer-based programs. These caches ar...
Providing high performance for pointer-intensive programs on modern architectures is an increasingly...
Hardware trends have produced an increasing disparity between processor speeds and memory access tim...
The literature has witnessed much work aimed at improving the efficiency of memory systems. The mot...
This paper describes Automatic Pool Allocation, a transformation framework that segregates distinct ...
As the gap between processor power and memory speed continues to widen, cache performance of modern ...
As the amount of data used by programs increases due to the growth of the hardware storage capacity,...
As the gap between processor and memory continues to grow, memory performance becomes a key performan...