Data movement between memory and CPU is a well-known energy bottleneck for analytics. Near-Memory Processing (NMP) is a promising approach for eliminating this bottleneck by shifting the bulk of the computation toward memory arrays in emerging stacked DRAM chips. Recent work in this space has been limited to regular computations that can be localized to a single DRAM partition. This paper examines a Join workload, which is fundamental to analytics and is characterized by irregular memory access patterns. We consider several join algorithms and show that while near-data execution can improve both energy efficiency and performance, effective NMP algorithms must consider the locality, access granularity, and microarchitecture of the stacked memory...
A large fraction of MapReduce execution time is spent processing the Map phase, and a large fraction...
As the performance of DRAM devices falls more and more behind computing capabilities, the limitation...
The memory system is a major bottleneck in achieving high performance and energy efficiency for vari...
High-performance analytical data processing systems often run on servers with large amounts of main ...
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memo...
Join is an important database operation. As computer architectures evolve, the best join algorithm m...
Recent technology advances in memory system design, along with 3D stacking, have made near-data proc...
The architectural changes introduced with multicore CPUs have triggered a redesign of main-memory jo...
The increasing demand for extracting value out of ever-growing data poses an ongoing challenge to sy...
The architectural changes introduced with multi-core CPUs have triggered a redesign of main...
Previous work [1] has claimed that the best performing implementation of in-memory hash joins is bas...
Traditionally, analytical database engines have used task parallelism provided by modern multisocket...
The end of Dennard scaling has made all systems energy-constrained. For data-intensive app...
Database management systems comprise various algorithms for efficiently retrieving and managing data...
The hash join algorithm family is one of the leading techniques for equi-join performance evaluation...