The key idea behind Inspector Joins is that during the I/O partitioning phase of a hash-based join, we have the opportunity to look at the actual data itself and then use this knowledge in two ways: (1) to create specialized indexes, specific to the given query on the given data, for optimizing the CPU cache performance of the subsequent join phase of the algorithm, and (2) to decide which join phase algorithm best suits this specific query. We show how inspector joins, employing novel statistics and specialized indexes, match or exceed the performance of state-of-the-art cache-friendly hash join algorithms. For example, when run on eight or more processors, our experiments show that inspector joins offer 1.1-1.4X speedups over these previou...
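The two uses of inspection described above can be illustrated with a toy sketch. It is not the paper's statistics or index structures: the function names, the duplicate-key counter used as the "statistic", and the two join-phase strategies are all assumptions chosen for illustration. The partitioning pass counts duplicate build keys per partition, and the join phase uses that count to pick between a single-row hash table and a multi-row one.

```python
from collections import defaultdict


def partition_and_inspect(rows, key, num_partitions):
    """Hash-partition rows and, in the same pass, count duplicate keys.

    The duplicate count is a hypothetical stand-in for the statistics an
    inspector-style join could gather while it partitions the data.
    """
    partitions = [[] for _ in range(num_partitions)]
    seen = [set() for _ in range(num_partitions)]
    duplicates = [0] * num_partitions
    for row in rows:
        k = key(row)
        p = hash(k) % num_partitions
        partitions[p].append(row)
        if k in seen[p]:
            duplicates[p] += 1
        else:
            seen[p].add(k)
    return partitions, duplicates


def inspector_style_join(build, probe, key_b, key_p, num_partitions=4):
    """Equi-join that picks a join-phase strategy per partition."""
    b_parts, b_dups = partition_and_inspect(build, key_b, num_partitions)
    p_parts, _ = partition_and_inspect(probe, key_p, num_partitions)
    out = []
    for p in range(num_partitions):
        if b_dups[p] == 0:
            # Build keys are unique here: one row per key is enough.
            table = {key_b(r): r for r in b_parts[p]}
            for s in p_parts[p]:
                r = table.get(key_p(s))
                if r is not None:
                    out.append((r, s))
        else:
            # Duplicates observed: fall back to a key -> list-of-rows table.
            table = defaultdict(list)
            for r in b_parts[p]:
                table[key_b(r)].append(r)
            for s in p_parts[p]:
                for r in table.get(key_p(s), ()):
                    out.append((r, s))
    return out


build = [(1, "a"), (2, "b"), (2, "c"), (5, "d")]
probe = [(2, "x"), (5, "y"), (7, "z")]
result = sorted(inspector_style_join(build, probe, lambda r: r[0], lambda s: s[0]))
```

The point of the sketch is only the control flow: information gathered for free during partitioning drives a per-partition decision in the join phase.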
In parallelizing the join operation of database systems, a primary objective is to partition the w...
High-performance analytical data processing systems often run on servers with large amounts of main ...
Traditionally, analytical database engines have used task parallelism provided by modern multisocket...
The hash join algorithm family is one of the leading techniques for equi-join performance evaluation...
The architectural changes introduced with multicore CPUs have triggered a redesign of main-memory jo...
Traditional join algorithms can be categorized into three groups: hash-based join, sort-merge join, ...
The architectural changes introduced with multi-core CPUs have triggered a redesign of main...
Join is an important database operation. As computer architectures evolve, the best join algorithm m...
In the past decade, the exponential growth in commodity CPU speed has far outpaced advances in memo...
Hash join algorithms suffer from extensive CPU cache stalls. This paper shows that the standard hash...
Two new algorithms, "Jive-join" and "Slam-join," are proposed for computing the ...
Index join performance is determined by the efficiency of the lookup operation on the involved index...
Minimizing both the response time to produce the first few thousand results and the overall executi...
We present new hash tables for joins, and a hash join based on them, that consumes far less memory a...
Hash joins combine massive relations in data warehouses, decision support systems, and scientific da...