Emerging TSV-based 3D integration technologies have shown great promise for overcoming the scalability limitations of 2D designs by stacking multiple memory dies on top of a many-core die. Application software developers need programming models and tools to fully exploit the potential of vertically stacked memory. In this work, we focus on efficient data mapping for SPMD parallel applications on an explicitly managed 3D-stacked memory hierarchy, which requires the placement of data across multiple vertical memory stacks to be carefully optimized. We propose a programming framework with compiler support that enables array partitioning. Each partition is mapped to the 3D-stacked memory on top of the processor that accesses it most, to take advantage of th...
Abstract. The shared memory paradigm provides many benefits to the parallel programmer, particular w...
Abstract—OpenMP is a de facto standard interface of the shared address space parallel programming m...
Abstract. This paper presents a source-to-source translation strategy from OpenMP to Global Arrays i...
Historically, processor performance has increased at a much faster rate than that of main memory and...
This paper aims to address the issue of CPU-memory intercommunication latency with the help of 3D st...
Several recent works have demonstrated the benefits of through-silicon-via (TSV) based 3D integratio...
Abstract—This paper demonstrates a fully functional hardware and software design for a 3D stacked m...
With the emergence of manycore architectures, the need for on-chip memories suc...
Memory bandwidth has become a major performance bottleneck as more and more cores are integrated ont...
In this paper we address the issue of efficient doall workload distribution on an embedded 3D MPSoC. ...
The novel ScaleMP vSMP architecture employs commodity x86-based servers with an InfiniBand network t...
The objective of this thesis is to optimize the uncore of 3D many-core architectures. More specifica...
Most of today’s state-of-the-art processors for mobile and embedded systems feature on-chip scratchp...
The fast emergence of OpenMP as the preferable parallel programming paradigm for small-to-medium sca...