Data processing systems often leverage vector instructions to achieve higher performance. When applying vector instructions, an often-overlooked data structure is the hash table, even though it is fundamental in data processing systems for operations such as indexing, aggregating, and joining. In this paper, we characterize and evaluate three fundamental vectorized hashing schemes: vectorized linear probing (VLP), vectorized fingerprinting (VFP), and bucket-based comparison (BBC). We implement these hashing schemes on the x86, ARM, and Power CPU architectures, as modern database systems must provide efficient implementations for multiple platforms due to the continuously increasing hardware heterogeneity. We present various implementation v...
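To make the idea behind vectorized linear probing concrete, the following is a minimal sketch in plain Python that emulates the vector operations with loops. It is not the paper's implementation: the names `VLPTable`, `compare_mask`, `first_set_bit`, `LANES`, and `EMPTY` are hypothetical, the vector width of four keys is an assumption, and deletions are not handled. A real implementation would replace `compare_mask` with a SIMD compare followed by a move-mask instruction.

```python
LANES = 4    # assumed vector width, in keys (hypothetical)
EMPTY = -1   # sentinel for a free slot; assumes keys are non-negative ints


def compare_mask(lanes, needle):
    """Emulate a SIMD compare plus movemask: bit i is set if lanes[i] == needle."""
    mask = 0
    for i, k in enumerate(lanes):
        if k == needle:
            mask |= 1 << i
    return mask


def first_set_bit(mask):
    """Index of the lowest set bit (a count-trailing-zeros on real hardware)."""
    return (mask & -mask).bit_length() - 1


class VLPTable:
    def __init__(self, capacity=16):
        if capacity % LANES != 0:
            raise ValueError("capacity must be a multiple of LANES")
        self.keys = [EMPTY] * capacity
        self.vals = [None] * capacity

    def _probe(self, key):
        """Yield (slot indices, keys) for each LANES-wide block, wrapping around."""
        n = len(self.keys)
        start = hash(key) % n
        for offset in range(0, n, LANES):
            idx = [(start + offset + i) % n for i in range(LANES)]
            yield idx, [self.keys[j] for j in idx]

    def insert(self, key, value):
        for idx, lanes in self._probe(key):
            hit = compare_mask(lanes, key)      # key already present?
            free = compare_mask(lanes, EMPTY)   # any free slot in this block?
            if hit or free:
                # Without deletions, a matching key always precedes any free slot.
                slot = idx[first_set_bit(hit if hit else free)]
                self.keys[slot] = key
                self.vals[slot] = value
                return
        raise RuntimeError("table is full")

    def lookup(self, key):
        for idx, lanes in self._probe(key):
            hit = compare_mask(lanes, key)
            if hit:
                return self.vals[idx[first_set_bit(hit)]]
            if compare_mask(lanes, EMPTY):
                return None  # a free slot ends the probe sequence
        return None
```

Each probe step inspects `LANES` slots with one compare instead of one slot at a time, which is the main saving the vectorized scheme aims for. The bit mask lets the code locate the matching lane without a per-slot branch.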
Hashing is a well-known and widely used technique for providing O(1) access to large files on second...
Fast concurrent hash tables are an increasingly important building block as we scale systems to grea...
Hashing has yet to be widely accepted as a component of hard real-time systems and hardware implemen...
In recent years, the increasing demand for high-performance analytics on big data has led the resear...
Most computer programs or applications need fast data structures. The performance of a data structur...
Hashing is one of the fundamental techniques used to implement query processing operators such as gr...
High-performance analytical data processing systems often run on servers with large amount...
We revisit the problem of building static hash tables on the GPU and present an efficient implementa...
Existing main-memory hash join algorithms for multi-core can be classified into two camps. ...
A number of recent papers have considered the influence of modern computer memory hierarchies on the...
Extracting valuable information from the rapidly growing field of Big Data faces serious performance...
Hashing is critical for high performance computer architecture. Hashing is used extensively...
The hash join algorithm family is one of the leading techniques for equi-join performance evaluation...
In this paper, we conducted empirical experiments to study the performance of hashing with a large s...
Linear Hashing is a dynamically updateable disk-based index structure which implements a hashing sch...