GPUs are increasingly adopted for large-scale database processing, where data accesses represent the major part of the computation. If the data accesses are irregular, like hash table accesses or random sampling, the GPU performance can suffer. Especially when scaling such accesses beyond 2GB of data, a performance decrease of an order of magnitude is encountered. This paper analyzes the source of the slowdown through extensive micro-benchmarking, attributing the root cause to the Translation Lookaside Buffer (TLB). Using the micro-benchmarks, the TLB hierarchy and structure are fully analyzed on two different GPU architectures, identifying never-before-published TLB sizes that can be used for efficient large-scale application tuning. Based...
Application virtual address space is divided into pages, each requiring a virtual-to-physical transl...
International audienceIn this work, we investigate the global memory access mechanism on recent GPUs...
The recent advent of high-throughput sequencing machines producing big amounts of short reads has bo...
GPUs are increasingly adopted for large-scale database processing, where data accesses represent the...
In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latenc...
During the last two decades, computer hardware has experienced remarkable developments. Especially C...
In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latenc...
Big Data applications are trivially parallelizable because they typically consist of simple and stra...
2018-08-02Recent exponential growth of the data sets size demanded by modern big data applications r...
In the past decade, the exponential growth in commodity CPUs speed has far outpaced advances in memo...
The last two decade has witnessed two opposing hardware trends where the DRAM capacity and the acces...
textGraphics Processing Units (GPUs) have become a popular platform for executing general purpose (i...
AbstractSpatial databases are used in a wide variety of real-world applications, such as land survey...
GPU has been considered as one of the next-generation platforms for real-time query processing datab...
We are in the computing era of super-zetta data bytes (a.k.a. Big Data). Big Data is critical to dev...
Application virtual address space is divided into pages, each requiring a virtual-to-physical transl...
International audienceIn this work, we investigate the global memory access mechanism on recent GPUs...
The recent advent of high-throughput sequencing machines producing big amounts of short reads has bo...
GPUs are increasingly adopted for large-scale database processing, where data accesses represent the...
In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latenc...
During the last two decades, computer hardware has experienced remarkable developments. Especially C...
In the past decade, advances in speed of commodity CPUs have far out-paced advances in memory latenc...
Big Data applications are trivially parallelizable because they typically consist of simple and stra...
2018-08-02Recent exponential growth of the data sets size demanded by modern big data applications r...
In the past decade, the exponential growth in commodity CPUs speed has far outpaced advances in memo...
The last two decade has witnessed two opposing hardware trends where the DRAM capacity and the acces...
textGraphics Processing Units (GPUs) have become a popular platform for executing general purpose (i...
AbstractSpatial databases are used in a wide variety of real-world applications, such as land survey...
GPU has been considered as one of the next-generation platforms for real-time query processing datab...
We are in the computing era of super-zetta data bytes (a.k.a. Big Data). Big Data is critical to dev...
Application virtual address space is divided into pages, each requiring a virtual-to-physical transl...
International audienceIn this work, we investigate the global memory access mechanism on recent GPUs...
The recent advent of high-throughput sequencing machines producing big amounts of short reads has bo...