Future multi-core and many-core processors are likely to contain one or more high-performance out-of-order cores to execute sequential programs and the sequential parts of parallel programs. These out-of-order cores will have to be more energy (and area) efficient than their present-day counterparts. This dissertation focuses on reducing the energy consumption of conventional out-of-order cores while maintaining or improving their performance. Specifically, it focuses on the store-load datapath, which consists of the data cache and translation lookaside buffer, load and store queues, and memory dependence predictor. This datapath is one of the most energy-hungry parts of a traditional out-of-order core, accounting for as much as ...
Memory accesses in modern processors are both far slower and vastly more energy-expensive than the a...
Conventional superscalar processors usually contain a large CAM-based LSQ (load/store queue) with poor...
Driven by the motivation to expose instruction-level parallelism (ILP), microprocessor cores have ev...
Modern out-of-order processor architectures focus significantly on the high performance execution of...
The first-level data cache in modern processors has become a major consumer of energy due to its inc...
High-performance processors use a large set-associative L1 data cache with multiple ports. As clock ...
Minimizing power, increasing performance, and delivering effective memory bandwidth are today's prim...
Around 2003, newly activated power constraints caused single-thread performance growth to slow drama...
In recent years, CPU performance has become energy constrained. If performance is to continue increa...
Historically, energy-constrained devices (ECDs) have favored simple in-order pipelines over out-of-o...
Cache memory is one of the most important components of a computer system. The cache allows quickly...
Buffer cache replacement schemes play an important role in conserving memory energy. Conventional al...
To alleviate the memory wall problem, current architectural trends suggest implementing large instr...
L1 data caches in high-performance processors continue to grow in set associativity. Higher associat...