The speed at which microprocessors can perform computations is increasing faster than the speed of access to main memory, making efficient use of memory caches ever more important. Because of this, information about the cache behavior of applications is valuable for performance tuning. To be most useful to a programmer, this information should be presented in a way that relates it to data structures at the source code level; we will refer to this as data centric cache information. This dissertation examines the problem of how to collect such information. We describe techniques for accomplishing this using hardware performance monitors and software instrumentation. We discuss both performance monitoring features that are present in e...
Improving cache performance requires understanding cache behavior. However, measuring cache performa...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Measurements of actual supercomputer cache performance have not been previously undertaken. PFC-Sim i...
This thesis proposes a buffered dual access mode cache to reduce power consumption in multicore cach...
The processor-memory gap is widening every year with no prospect of reprieve. More and more latency ...
For performance analysis tools to be useful, they need to show the relation of detected bott...
Many data-intensive applications exhibit poor temporal and spatial locality and perform poorly on co...
There is an ever widening performance gap between processors and main memory, a gap bridged by small...
This dissertation addresses two sets of challenges facing processor design as the industry enters th...
The growing gap between processor clock speed and DRAM access time puts new demands on software and ...
Modern multi-core microprocessors cannot function anymore without memory caches, in multiple layers,...
This thesis evaluates an innovative cache design called the prime-mapped cache. The performance analysi...
With contemporary research focusing its attention primarily on benchmark-driven performance evaluati...
For forty years, transistor counts on integrated circuits have doubled roughly every two years, enab...