Sorting database tables before compressing them improves the compression rate. Can we do better than lexicographical order? For minimizing the number of runs in a run-length encoding compression scheme, the best approaches to row ordering are derived from traveling salesman heuristics, although there is a significant trade-off between running time and compression. Multiple Lists, a new heuristic and a variant of Nearest Neighbor, gives up some compression for a major running-time speedup, which makes it a good option for very large tables. However, for some compression schemes, it is more important to generate long runs than few runs. For this case, another novel heuristic, Vortex, is promising. We find that we can improve run-length ...
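To make the objective concrete, here is a minimal sketch, assuming each column is run-length encoded independently. Under that assumption, the total number of runs equals the number of columns plus the sum of Hamming distances between consecutive rows, which is why row ordering resembles a traveling salesman problem. The sketch compares lexicographical sorting with a plain greedy Nearest Neighbor ordering; it is not the Multiple Lists or Vortex heuristic, and the function names `count_runs`, `hamming`, and `nearest_neighbor_order` are illustrative, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Row = Tuple[str, ...]


def count_runs(rows: Sequence[Row]) -> int:
    """Total runs over all columns, each column run-length encoded on its own."""
    if not rows:
        return 0
    runs = len(rows[0])  # the first row opens one run in every column
    for prev, cur in zip(rows, rows[1:]):
        # Every column whose value changes between adjacent rows starts a new run.
        runs += sum(1 for a, b in zip(prev, cur) if a != b)
    return runs


def hamming(a: Row, b: Row) -> int:
    """Number of columns in which two rows differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_neighbor_order(rows: Sequence[Row]) -> List[Row]:
    """Greedy O(n^2) ordering: start at the first row, then repeatedly
    append the remaining row closest (in Hamming distance) to the last one."""
    remaining = list(rows)
    if not remaining:
        return []
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        i = min(range(len(remaining)), key=lambda k: hamming(last, remaining[k]))
        order.append(remaining.pop(i))
    return order


if __name__ == "__main__":
    table = [("a", "x"), ("b", "y"), ("a", "y"), ("b", "x")]
    print(count_runs(table))                          # original order: 7
    print(count_runs(sorted(table)))                  # lexicographical: 6
    print(count_runs(nearest_neighbor_order(table)))  # nearest neighbor: 5
```

On this toy table the greedy ordering saves one run over the lexicographical sort, but its quadratic cost illustrates the running-time trade-off the abstract mentions.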
Sorting and searching are large parts of database query processing, e.g., in the forms of index crea...
Sorting is a classic problem and one to which many others reduce easily. In the streaming model, how...
Compression reduces both the size of indexes and the time needed to evaluate queries. In this paper,...
Column-oriented indexes (such as projection or bitmap indexes) are compressed by run-length encoding t...
Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage. To accelerate...
We give experimental evidence for the benefits of order-preserving compression in sorting algorithms...
Bitmap indexes are frequently used to index multidimensional data. They rely mostly on sequential in...
As no database exists without indexes, no index implementation exists without or...
Many scientific applications generate massive volumes of data through observations or computer simul...
We study the problem of compressing massive tables within the partition-training paradigm introduced...
Many scientific applications generate massive volumes of data through observations or computer simu...
Lexicographical sorting is a fundamental problem with applications to contingency tables, ...
A bitmap index is a type of database index in which querying is implemented using logical operations...