Recent work has shown that instructions per cycle (IPC) is not a valid performance metric for multithreaded workloads running in execution-driven, full-system simulation environments, because program behavior is non-deterministic. Unfortunately, abandoning IPC introduces its own host of difficulties: special workload setup, consideration of cold-start and end effects, statistical methodologies that increase the required simulation bandwidth, and workload-specific, higher-level metrics to measure performance. This paper explores the non-determinism problem in multithreaded programs, describes a method to eliminate non-determinism across simulations of different experimental machine models, and demonstrates the suitability o...
Almost all new consumer-grade processors are capable of executing multiple programs simultaneously. ...
Microarchitectural simulation of multithreaded architectures with shared resources, such as simultan...
Along with commercial chip-multiprocessors (CMPs) integrating more and more cores, memory systems ar...
As multiprocessors become mainstream, techniques to address efficient simulation of multi-threaded ...
Nowadays, multithreaded architectures are becoming more and more popular. In order to evaluate their...
Weighted speedup is nowadays the most commonly used multiprogram workload performance metric. Weight...
Detailed, cycle-accurate processor simulation is an integral component of the design and study of c...
Sampling is a well-known workload reduction technique that allows one to speed up architect...
Composing a representative multi-program multi-core workload is non-trivial. A multi-core processor ...
Multithreaded architectures are becoming more and more popular. In order to evaluate their behavior,...
As the complexity of processors increases, it becomes harder for designers to understand the non-tri...
The Memory Wall continues to be a problem with modern systems design. While the steady increase in p...
Increasingly complex consumer electronics applications call for embedded processors with higher perf...
Simulation remains an important component in the design of multicore processor architectures, just a...