This paper describes Gleipnir, a program profiling and analysis tool. Gleipnir collects memory access traces and associates each access with a specific internal program structure such as a thread, a function, a data structure, or a scalar variable. The data provided by Gleipnir can be used to analyze how program variables and their memory accesses map to L1 and higher-level caches. This information can be used to investigate techniques for refactoring data or code to improve memory access performance. Our hypothesis is that optimizing cache performance at all levels is important for both single-core and multi-core processors. In this paper we will describe the Gleipnir tool and some examples of its use in optimi...
This paper presents a multi-cache profiler for shared memory multiprocessor systems. For each progra...
Memory bandwidth has become the performance bottleneck for memory intensive programs on modern proce...
To cope with the increasing difference between processor and main memory speeds, modern computer sys...
Embedded and high performance applications often require fine-tuning to improve their performance. T...
Application analysis is facilitated through a number of program profiling tools. The tools v...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...
Measurements of actual supercomputer cache performance have not been previously undertaken. PFC-Sim i...
Software applications’ performance is hindered by a variety of factors, but most notably by the well...
Application performance on modern microprocessors depends heavily on performance related characteris...
Program redundancy analysis and optimization have been an important component in optimizing compiler...
The divergence between processor and memory performance has been a well discussed aspect of computer...
Minimizing power, increasing performance, and delivering effective memory bandwidth are today's prim...
There is an ever widening performance gap between processors and main memory, a gap bridged by small...