Inexpensive DRAMs have created new opportunities for in-memory data analytics. However, the major bottleneck in such systems is high memory access latency. Traditionally, this problem is addressed with large cache hierarchies, which benefit only regular applications. In contrast, many data-intensive applications exhibit irregular behavior. Hardware multithreading can better cope with the high latency seen in such applications. This article implements a multithreaded prototype (MTP) on FPGAs for the relational selection operator, which exhibits control-flow irregularity. On a standard TPC-H query evaluation, MTP achieves a bandwidth utilization of 83%, while the CPU and GPU implementations achieve 61% and 64%, respectively. Besides being bandwidt...
General-purpose computing platforms have generally been favored over customized computational setups...
A new trend in processor design is increased on-chip support for multithreading in the form of both ...
The last two decades have witnessed two opposing hardware trends where the DRAM capacity and the acces...
The performance gap between CPUs and memory has widened significantly since the 1980s, maki...
The increase in size and decrease in cost of DRAMs has led to a rapid growth of in-memory solutions ...
With computing systems becoming ubiquitous, numerous data sets of extremely large size are becoming ...
Algorithms that exhibit irregular memory access patterns are known to show poor performance on multi...
FPGA-based data processing is becoming increasingly relevant in data centers, as the transformation ...
The decreasing cost of DRAM has enabled and expanded the use of in-memory databases. However, mem...
© 2020 Association for Computing Machinery. There has been a significant amount of excitement and rece...
The past decade has witnessed the proliferation of new ways to ingest, store, index, and query data....
grantor: University of Toronto. Memory latency is becoming an increasingly important perform...
Long memory latencies are mitigated through the use of large cache hierarchies in multi-core archite...