We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has a 100% fill factor and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins, and renew the case for non-partitioned hash joins. We introduce a variant of partitioned joins in which only the build is partitioned, but the probe is not, as this is more ef...
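The bitmap-plus-population-count idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation; it is a simplified model under stated assumptions: unique build keys, a hypothetical class name `ConciseHashTableSketch`, an assumed 32-bit bitmap word, an assumed probe limit of two bitmap positions, an assumed default of 8 bitmap bits per key, and a plain dictionary standing in for whatever overflow structure a real system would use.

```python
class ConciseHashTableSketch:
    """Illustrative only: sparse bitmap + prefix popcounts + fully dense array."""

    WORD_BITS = 32   # bitmap bits per word (assumption)
    PROBE_LIMIT = 2  # bitmap positions tried per key (assumption)

    def __init__(self, items, bits_per_key=8):
        items = list(items)  # (key, value) pairs; keys assumed unique
        n_words = max(1, (len(items) * bits_per_key + self.WORD_BITS - 1)
                      // self.WORD_BITS)
        self.n_bits = n_words * self.WORD_BITS
        self.words = [0] * n_words
        self.overflow = {}   # keys that found no free bitmap position
        placed = []          # (bit position, key, value)
        for key, value in items:
            pos = self._claim(key)
            if pos is None:
                self.overflow[key] = value
            else:
                placed.append((pos, key, value))
        # prefix[i] = number of set bits in words[0 .. i-1]
        self.prefix, total = [], 0
        for w in self.words:
            self.prefix.append(total)
            total += bin(w).count("1")
        # The dense array has exactly one slot per set bit: 100% fill.
        self.dense = [None] * total
        for pos, key, value in placed:
            self.dense[self._rank(pos)] = (key, value)

    def _positions(self, key):
        start = hash(key) % self.n_bits
        return [(start + i) % self.n_bits for i in range(self.PROBE_LIMIT)]

    def _is_set(self, pos):
        return (self.words[pos // self.WORD_BITS] >> (pos % self.WORD_BITS)) & 1

    def _claim(self, key):
        # Linear probing over bitmap positions, not over array slots.
        for pos in self._positions(key):
            if not self._is_set(pos):
                self.words[pos // self.WORD_BITS] |= 1 << (pos % self.WORD_BITS)
                return pos
        return None

    def _rank(self, pos):
        # Dense-array index = number of set bits strictly before pos.
        word, offset = divmod(pos, self.WORD_BITS)
        below = self.words[word] & ((1 << offset) - 1)
        return self.prefix[word] + bin(below).count("1")

    def might_contain(self, key):
        # Bloom-filter-style check: no false negatives, some false positives.
        return any(self._is_set(p) for p in self._positions(key))

    def get(self, key, default=None):
        for pos in self._positions(key):
            if self._is_set(pos):
                k, v = self.dense[self._rank(pos)]
                if k == key:
                    return v
        return self.overflow.get(key, default)
```

In this sketch the bitmap is sparse (several bits per key), so most keys claim their first position, while the payload array stays fully packed because a key's slot is derived from the count of set bits before its position rather than from the hash value itself.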
Hash join algorithms suffer from extensive CPU cache stalls. This paper shows that the standard hash...
Minimizing both the response time to produce the first few thousand results and the overall executi...
We analyze the costs, and describe the implementation, of three hash-based join algorithms for a g...
TID hash joins are a simple and memory-efficient method for processing large join queries. They are ...
Hash joins combine massive relations in data warehouses, decision support systems, and scientific da...
The widening performance gap between CPU and disk is significant for hash join performance. Most cur...
In database systems most join algorithms are binary and will only operate on two inputs at a time. ...
Previous work [1] has claimed that the best performing implementation of in-memory hash joins is bas...
Driven by the two main hardware trends, increasing main memory and massively parallel multi...
The hash join algorithm family is one of the leading techniques for equi-join performance evaluation...
Recently, Haas and Hellerstein proposed the hash ripple join algorithm in the context of online aggr...
In this paper we present HATCH, a novel hash join engine. We follow a new design point which enables...
One of the most prominent ways to evaluate an equi-join is based on hashing. We consider the problem...
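For readers unfamiliar with the basic technique, the following is a minimal sketch of a textbook in-memory build/probe equi-join. It is not taken from any of the papers listed here; the function name `hash_join` and its key-extractor parameters are hypothetical, and both inputs are assumed to fit in memory.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Textbook hash equi-join: build a table on one input, probe with the other."""
    # Build phase: group the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[build_key(row)].append(row)

    # Probe phase: look up each probe row's key and emit matching pairs.
    result = []
    for row in probe_rows:
        for match in table.get(probe_key(row), ()):
            result.append((match, row))
    return result


# Usage with made-up rows: join orders to customers on the customer id.
customers = [(1, "Ada"), (2, "Grace")]
orders = [(100, 1), (101, 2), (102, 1), (103, 9)]
pairs = hash_join(customers, orders, lambda c: c[0], lambda o: o[1])
```

Building on the smaller input keeps the hash table small, which is the usual motivation for distinguishing the build side from the probe side.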
Join is an important database operation. As computer architectures evolve, the best join algorithm m...