Several studies and recent real-world designs have promoted sharing of underutilized resources between cores in a multicore processor to achieve better performance/power. It has been argued that when utilization of such resources is low, sharing has negligible impact on performance, while offering considerable area and power benefits. In this paper we investigate the performance and performance/Watt implications of sharing large and underutilized resources between pairs of cores in a multicore processor. We first study sharing of the entire floating-point datapath (including reservation stations and execution units) by two cores, similar to AMD’s Bulldozer. We find that while this architecture results in power savings, for certain workload combinatio...
High performance computing (HPC) applications have parallel code sections that must scale to large n...
Microarchitectural techniques, such as superscalar instruction issue, Out-Of-Order instruction execu...
Multiprocessor application performance can be limited by the operating system when the application u...
Abstract: Several studies and real-world designs have advocated the sharing of large execution units ...
For many years, we have observed a shift from classical multiprocessor systems to multicores, which tight...
As the push for parallelism continues to increase the number of cores on a chip, system design has b...
We show that when multi-threaded benchmarks are executed on a Chip Multiprocessor (CMP), the threads...
Simultaneous MultiThreading (SMT) achieves better system resource utilization and higher performance...
Chip multiprocessors (CMPs) containing two to eight cores with support for up to eight hardware thread c...
Transport triggered architecture processors may have function unit input registers, which allow oper...
Resource sharing occurs when multiple active processes or software components compete for system res...
Simultaneous multithreading (SMT) allows multiple hardware threads to execute concurrently on a proc...
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer S...
Technology scaling trends have forced designers to consider alternatives to deeply pipelining aggres...