Towards Eliminating Random I/O in Hash Joins

Ming-Ling Lo
Chinya V. Ravishankar

Publication date

January 1996

Abstract

The widening performance gap between CPU and disk is significant for hash join performance. Most current hash join methods try to reduce the volume of data transferred between memory and disk. In this paper, we try to reduce hash-join times by reducing random I/O. We study how current algorithms incur random I/O, and propose a new hash join method, Seq+, that converts much of the random I/O to sequential I/O. Seq+ uses a new organization for hash buckets on disk, and larger input and output buffer sizes. We introduce the technique of batch writes to reduce the bucket-write cost, and the concepts of write- and read-groups of hash buckets to reduce the bucket-read cost. We derive a cost model for our method, and present formulas for cho...
