We formulate a conceptual model for white-box compression, which represents the logical columns in tabular data as an openly defined function over a set of physical columns that are actually stored. Each block of data should thus be accompanied by a header that describes this functional mapping. Because these compression functions are openly defined, database systems can exploit them during query optimization and execution, enabling, for example, better filter predicate pushdown. In addition, we show that white-box compression can identify a broad variety of new compression opportunities, leading to much better compression factors. These opportunities are identified by an automatic learning process that learns the functions from the data. We prov...
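To make the idea of a logical column defined as a function over physical columns more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method from the paper: the `PrefixedIntBlock` class, the `compress` helper, and the "constant prefix plus zero-padded integer" pattern are all hypothetical. The block header holds the function parameters (`prefix`, `width`), the physical column holds plain integers, and an equality filter is evaluated on the integers without decoding any strings.

```python
from dataclasses import dataclass
from typing import List, Optional


def _parse_suffix(value: str, prefix: str, width: int) -> Optional[int]:
    """Return the integer part of value if it matches prefix + width digits."""
    if not value.startswith(prefix):
        return None
    suffix = value[len(prefix):]
    if len(suffix) != width or not (suffix.isascii() and suffix.isdigit()):
        return None
    return int(suffix)


@dataclass
class PrefixedIntBlock:
    """Hypothetical block: logical value = prefix + zero-padded integer."""
    prefix: str          # header: constant part of every logical value
    width: int           # header: number of digits after the prefix
    numbers: List[int]   # physical column that is actually stored

    def decode(self) -> List[str]:
        # Apply the openly defined function to rebuild the logical column.
        return [f"{self.prefix}{n:0{self.width}d}" for n in self.numbers]

    def select_equal(self, literal: str) -> List[int]:
        # Push the equality filter down to the physical integer column.
        target = _parse_suffix(literal, self.prefix, self.width)
        if target is None:
            return []  # the function can never produce this literal
        return [i for i, n in enumerate(self.numbers) if n == target]


def compress(values: List[str], prefix: str, width: int) -> PrefixedIntBlock:
    # Assumes the pattern is already known; a learning step would find it.
    numbers = []
    for v in values:
        n = _parse_suffix(v, prefix, width)
        if n is None:
            raise ValueError(f"value {v!r} does not fit the pattern")
        numbers.append(n)
    return PrefixedIntBlock(prefix, width, numbers)
```

For example, `compress(["ORD-000042", "ORD-000007"], "ORD-", 6)` stores only the integers 42 and 7, and `select_equal("ORD-000042")` compares integers rather than strings. A real system would also need an exception path for values that do not fit the pattern; this sketch simply raises an error.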
While a variety of lossy compression schemes have been developed for certain forms of digital data (...
Decision-support applications in emerging environments require that SQL query results or intermediat...
A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengt...
Modern columnar databases heavily use compression to reduce memory footprint and boost query executi...
Columnar databases have dominated the data analysis market for their superior performance in query p...
Data compression is one way to gain better performance from a database. Compression is typically ach...
Column-oriented database system architectures invite a reevaluation of how and when data in database...
We study the problem of compressing massive tables. We devise a novel compression paradigm--training...
Data Compression is today essential for a wide range of applications: for example Internet and the W...
Through this study, we propose two algorithms. The first algorithm describes the concept of compress...
Domain encoding is a common technique to compress the columns of a column store and to accelerate ma...
Compression can sometimes improve performance by making more of the data available to the processors...
Lossy compression algorithms trade bits for quality, aiming at reducing as muc...
Data compression techniques can improve information system performance by reducing the size of a dat...