The performance of data-intensive applications running on modern multi- and many-core processors is largely determined by their memory access behavior. The most important contributors to that behavior are the frequency and latency of off-chip accesses and the extent to which long-latency memory accesses can be overlapped with useful computation or with each other. In this paper we present two methods to better understand the interactions between applications and the microarchitecture. An epoch profile is an intuitive way to understand the relationships between three important characteristics: the on-chip cache size, the size of the reorder window of an out-of-order processor, and the frequency of processor stalls caused by long-latency, off-chip requests (epochs). B...
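As an illustration of how the three characteristics interact, the following is a minimal sketch of epoch counting under simplifying assumptions that are ours, not the paper's. It assumes that off-chip misses are independent of each other, that a reorder window opens at the first miss of an epoch, and that every later miss inside that window overlaps with it. The names `count_epochs` and `epoch_profile`, and the miss positions used below, are hypothetical.

```python
def count_epochs(miss_positions, window_size):
    """Count epochs in a trace of off-chip misses.

    miss_positions: instruction indices at which an off-chip miss occurs
                    (hypothetical input for one cache size).
    window_size:    reorder window size in instructions.

    Assumption: misses that fall within one reorder window of the first
    miss of an epoch overlap with it and cause no additional stall.
    """
    epochs = 0
    window_end = -1  # last instruction index covered by the current epoch
    for pos in sorted(miss_positions):
        if pos > window_end:
            # This miss cannot overlap with the previous epoch: new stall.
            epochs += 1
            window_end = pos + window_size - 1
    return epochs


def epoch_profile(misses_by_cache_size, window_sizes):
    """Tabulate epoch counts per (cache size, window size) pair."""
    return {
        (cache, window): count_epochs(misses, window)
        for cache, misses in misses_by_cache_size.items()
        for window in window_sizes
    }


# Hypothetical miss traces: a larger cache removes some off-chip misses.
traces = {"small": [0, 10, 200, 210, 500], "large": [0, 500]}
profile = epoch_profile(traces, [8, 512])
```

With these made-up traces, widening the reorder window from 8 to 512 instructions merges the five epochs of the small cache into one, while the large cache goes from two epochs to one. A real profile would also have to account for dependent misses and other events that close the window early.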
To analyze the performance of applications and architectures, both programmers and architects desire...
To reduce latency and increase bandwidth to memory, modern microprocessors are designed with deep me...
There exists a divide between the ever-increasing demand for high-performance embedded systems and t...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...
Optimizing processors for specific application(s) can substantially improve energy-efficiency. With ...
The performance of memory-bound commercial applications such as databases is limited by increasing m...
For several years, classical multiprocessor systems have evolved to multicor...
To increase performance, modern processors employ complex techniques such as out-of-order pipelines ...
The energy demands of modern mobile devices have driven a trend towards heterogeneous multi-core sys...
Programs exhibit significant performance variance in their access to microarchitectural structures. ...
Understanding the behavior of emerging workloads is important for designing next generation micropro...
The microarchitectural design space of a new processor is too large for an architect to eva...
To design computers which reach the performance limits of the implementation technology, one must un...
Optimizing processors for (a) specific application(s) can substantially improve energy-efficiency. W...