The ubiquitous Variable-Byte encoding is one of the fastest compressed representations for integer sequences. However, its compression ratio is usually not competitive with that of more sophisticated encoders, especially when the integers to be compressed are small, which is the typical case for inverted indexes. This paper shows that the compression ratio of Variable-Byte can be improved by 2× by adopting a partitioned representation of the inverted lists. This makes Variable-Byte surprisingly competitive in space with the best bit-aligned encoders, hence disproving the folklore belief that Variable-Byte is space-inefficient for inverted index compression. Despite the significant space savings, we show that our optimization almost comes for fre...
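For readers unfamiliar with the baseline, the following is a minimal sketch of plain Variable-Byte coding applied to the gaps of a sorted posting list. It is not the partitioned representation proposed in the paper. The function names (`vbyte_encode`, `vbyte_decode`, `to_gaps`, `from_gaps`) are hypothetical, and the byte layout (7 payload bits per byte, least-significant group first, high bit set on the last byte of each integer) is an assumed convention; other implementations use the opposite flag polarity or byte order.

```python
from typing import Iterable, List


def vbyte_encode(values: Iterable[int]) -> bytes:
    """Encode non-negative integers, 7 payload bits per byte.

    Assumed convention: the high bit is set on the LAST byte of each integer.
    """
    out = bytearray()
    for v in values:
        if v < 0:
            raise ValueError("Variable-Byte handles non-negative integers only")
        while v >= 128:
            out.append(v & 0x7F)  # continuation byte: high bit clear
            v >>= 7
        out.append(v | 0x80)      # terminating byte: high bit set
    return bytes(out)


def vbyte_decode(data: bytes) -> List[int]:
    """Inverse of vbyte_encode under the same assumed convention."""
    values: List[int] = []
    current, shift = 0, 0
    for b in data:
        current |= (b & 0x7F) << shift
        if b & 0x80:              # last byte of this integer
            values.append(current)
            current, shift = 0, 0
        else:
            shift += 7
    return values


def to_gaps(postings: List[int]) -> List[int]:
    """Turn a sorted list into differences between consecutive elements."""
    gaps, prev = [], 0
    for p in postings:
        gaps.append(p - prev)
        prev = p
    return gaps


def from_gaps(gaps: List[int]) -> List[int]:
    """Rebuild the sorted list from its gaps by prefix sums."""
    postings, total = [], 0
    for g in gaps:
        total += g
        postings.append(total)
    return postings


postings = [3, 7, 130, 131, 20000]
encoded = vbyte_encode(to_gaps(postings))
decoded = from_gaps(vbyte_decode(encoded))
```

The sketch also shows why small integers are the weak spot the abstract mentions: every gap costs at least one full byte, even when a few bits would suffice.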
Sorted lists of integers are commonly used in inverted indexes and database systems. They are often ...
The Elias-Fano representation of monotone sequences has been recently applied to the compression of ...
Given a sequence S = s_1 s_2 ... s_n of integers smaller than r = O(polylog(n)), we show how S can b...
Compression reduces both the size of indexes and the time needed to evaluate queries. In this paper,...
The data structure at the core of large-scale search engines is the inverted index, which is essenti...
Dictionary-based compression schemes provide fast decoding, typically at the expense of re...
Inverted indexes are usually represented by dividing posting lists into constant-sized blocks and re...
Arrays of integers are often compressed in search engines. Though there are many ways to compress in...
Enormous datasets are a common occurrence today and compressing them is often beneficial. Fast direc...
Compression can sometimes improve performance by making more of the data available to the processors...
Efficient access to the inverted index data structure is a key aspect for a search engine to achieve...
To sustain the tremendous workloads they suffer on a daily basis, Web search engines employ highly c...