GPUs are increasingly adopted for large-scale database processing, where data accesses make up the major part of the computation. If the data accesses are irregular, as in hash table accesses or random sampling, GPU performance can suffer. In particular, when such accesses scale beyond 2 GB of data, performance drops by an order of magnitude. This paper analyzes the source of the slowdown through extensive micro-benchmarking and attributes the root cause to the Translation Lookaside Buffer (TLB). Using the micro-benchmarks, the TLB hierarchy and structure are fully analyzed on two different GPU architectures, identifying never-before-published TLB sizes that can be used for efficient large-scale application tuning. Based...
Once exotic, computational accelerators are now commonly available in many computing systems. Graphi...
The memory system has been evolving at a fast pace recently, driven by the emergence of large-scale ...
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memo...
Big Data applications are trivially parallelizable because they typically consist of simple and stra...
In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latenc...
During the last two decades, computer hardware has experienced remarkable developments. Especially C...
We are in the computing era of super-zetta data bytes (a.k.a. Big Data). Big Data is critical to dev...
Algorithms for processing large, unstructured data sets have shown great promise in implementations ...
2018-08-02 Recent exponential growth in the data set size demanded by modern big data applications r...
In-memory big-data processing is rapidly emerging as a promising solution for large-scale data analy...
© 2020 Association for Computing Machinery. There has been a significant amount of excitement and rece...
The last two decades have witnessed two opposing hardware trends where the DRAM capacity and the acces...
Analytic database workloads are growing in data size and query complexity. At the same time, compute...