Sampled performance monitor event traces can be used to study memory performance, characterize process behavior, and improve application performance. As an example, this paper demonstrates how sampled traces of the TPC-C benchmark, executed on eight- and 32-processor configurations of the IBM eServer pSeries 690 (p690), are analyzed to identify the resolution sites of level-two (L2) cache data-load misses and to study the heavily hit resolution sites, i.e., level-three (L3) caches and main memory. The goal is to recognize the heavily hit regions of the application’s address space, segments, pages, cache blocks, routines, instructions, and data structures. Preliminary data analysis of the traces, using a powerful ...
With software applications increasing in complexity, description of hardware is becoming increas...
Modern memory systems play a critical role in the performance of applications, but a detailed unders...
As the processor-memory performance gap continues to grow, so does the need for effective tools and ...
One of the major architectural design considerations for any computer system is that of the memory s...
The research that we have performed in collaboration with IBM uses sampled event traces, which were ...
Because of the increasing gap between processor frequency and Dynamic Random Access Memory (DRAM) sp...
The growing gap between processor and memory speeds has led to complex memory hierarchies as proces...
Measurements of actual supercomputer cache performance have not been previously undertaken. PFC-Sim i...
There is an ever-widening performance gap between processors and main memory, a gap bridged by small...
Modern processors incorporate several performance monitoring units, which can be used to count event...
To reduce latency and increase bandwidth to memory, modern microprocessors are designed with deep me...
The growing gap between processor and memory speeds results in complex memory hierarchies as process...
On multicore processors, co-executing applications compete for shared resources, such as cache capac...
Application performance on modern microprocessors depends heavily on performance-related characteris...
Data locality is central to modern computer designs. The widening gap between processor speed and me...