Stacked DRAM memories have become a reality in High-Performance Computing (HPC) architectures. These memories provide much higher bandwidth while consuming less power than traditional off-chip memories, but their limited capacity is insufficient for modern HPC systems. For this reason, stacked DRAM and off-chip memories are expected to co-exist in HPC architectures, giving rise to different approaches for architecting the stacked DRAM in the system. This paper proposes a runtime approach to transparently manage stacked DRAM memories in task-based programming models. In this approach, the runtime system is in charge of copying the data accessed by the tasks to the stacked DRAM, without any complex hardware support nor modificatio...
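To make the idea of runtime-managed copies concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy runtime in which each task declares the buffers it touches, and the runtime copies those buffers into a small fast tier (standing in for stacked DRAM) before running the task. The names `Task`, `Runtime`, `fast_capacity`, and `flush`, as well as the "write everything back when space runs out" policy, are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """A unit of work plus the names of the buffers it accesses."""
    func: Callable[..., None]
    deps: List[str]


@dataclass
class Runtime:
    """Toy two-tier memory manager (hypothetical, for illustration only)."""
    fast_capacity: int  # capacity of the fast tier, in elements
    slow: Dict[str, list] = field(default_factory=dict)  # off-chip memory
    fast: Dict[str, list] = field(default_factory=dict)  # stacked DRAM
    copies_in: int = 0  # number of buffers copied into the fast tier

    def _fast_used(self) -> int:
        return sum(len(buf) for buf in self.fast.values())

    def flush(self) -> None:
        # Write every fast-tier buffer back to the slow tier, then free it.
        for name, buf in self.fast.items():
            self.slow[name] = list(buf)
        self.fast.clear()

    def _stage(self, name: str) -> list:
        # Copy a buffer into the fast tier unless it is already resident.
        if name not in self.fast:
            self.fast[name] = list(self.slow[name])
            self.copies_in += 1
        return self.fast[name]

    def run(self, task: Task) -> None:
        total = sum(len(self.slow[d]) for d in task.deps)
        if total > self.fast_capacity:
            # Working set does not fit: run directly on the slow tier.
            self.flush()
            task.func(*(self.slow[d] for d in task.deps))
            return
        missing = sum(len(self.slow[d]) for d in task.deps
                      if d not in self.fast)
        if self._fast_used() + missing > self.fast_capacity:
            # Assumed policy: evict everything when space runs out.
            self.flush()
        task.func(*(self._stage(d) for d in task.deps))


def scale(src: list, dst: list) -> None:
    # Example task body: dst[i] = 2 * src[i], modifying dst in place.
    for i, x in enumerate(src):
        dst[i] = 2 * x


rt = Runtime(fast_capacity=8)
rt.slow["a"] = [1, 2, 3, 4]
rt.slow["b"] = [0, 0, 0, 0]
rt.run(Task(scale, ["a", "b"]))  # both buffers are staged, then the task runs
rt.flush()                       # results become visible in the slow tier
```

The task body never refers to a memory tier; it only receives buffers. That is the sense in which the management is transparent in this sketch. A real runtime would also overlap copies with computation and use a finer-grained eviction policy.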
Despite the success of parallel architectures and domain-specific accelerators in boosting the perfo...
Byte-addressable non-volatile memories (NVM) have been envisioned as a new tier in computer systems,...
In this paper we focus on common data reorganization operations such as shuffle, pack/unpack, swap,...
Recent technology advancements allow for the integration of large memory structures on-die ...
During the last decade, managed runtime systems have been constantly evolving to become capable of e...
Contemporary DRAM systems have maintained impressive scaling by managing a careful balance betwe...
the tight integration of significant quantities of DRAM with high-performance computation logic. How...
Most computing systems are heavily dependent on their main memories, as their primary storage, or as...
This paper analyzes the trade-offs in architecting stacked DRAM either as part of main memo...
Integrated Heterogeneous System (IHS) processors pack throughput-oriented General-Purpose Graphics P...
As device technologies scale in the nanometer era, the current off-chip DRAM technologies are very c...
The memory system is a major bottleneck in achieving high performance and energy efficiency for vari...
Efficiently managing the memory subsystem of modern multi/manycore architectures is increasingly bec...
Memory size has long limited large-scale applications on high-performance computing (HPC) ...