High-performance computing (HPC) applications have parallel code sections that must scale to large numbers of cores, which makes them sensitive to serial regions. Current supercomputing systems with heterogeneous or asymmetric CMPs (ACMPs) combine a few high-performance big cores for serial regions with many low-power lean cores for throughput computing. Because HPC applications place low demands on the core front-end, some designs, such as SMT and GPU cores, share front-end structures including the instruction cache (I-cache). However, little work has analyzed the benefit of sharing the I-cache among full cores, which seems a compelling way to reduce silicon area and power. This paper analyzes the performance, powe...
On the road to computer systems able to support the requirements of exascale applications, Chip Mult...
Several Chip-Multiprocessor designs today leverage tightly-coupled computing clusters as a building ...
The number of processor cores and on-chip cache size have been increasing on chip multiprocessors (CM...
There is a need to increase performance under the same power and area envelope to achieve Exascale t...
Microprocessor design has changed significantly in the last few decades, moving fro...
Several studies and real-world designs have advocated the sharing of large execution units ...
Recent many-core processors such as Intel’s Xeon Phi and GPGPUs specialize in running highly scalabl...
In recent years, embedded systems have evolved so that they offer capabilities we could only find ...
To meet the growing demands of computation-intensive applications and the needs of low-power, high-performance ...
L1 instruction caches in many-core systems represent a sizable fraction of the total power consumpt...
From single-core CPUs to detachable compute accelerators, supercomputers have made tremendous progress ...
This dissertation explores techniques for reducing the costs of inter-processor communication i...
Several studies and recent real world designs have promoted sharing of underutilized resources betwe...
One of the critical problems associated with emerging chip multiprocessors (CMPs) is the management ...