Analytics is moving to the cloud, and data is moving into data lakes. These reside on object storage services like S3 and enable seamless data sharing and system interoperability. To support this, many systems build on open storage formats like Apache Parquet. However, these formats are not optimized for remotely accessed data lakes and today's high-throughput networks. Inefficient decompression makes scans CPU-bound and thus increases query time and cost. In this work, we present BtrBlocks, an open columnar storage format designed for data lakes. BtrBlocks uses a set of lightweight encoding schemes, achieving fast and efficient decompression and high compression ratios.
The last few years have seen an exponential increase, driven by many disparate fields such as big da...
Many database applications make extensive use of bitmap indexing schemes. In this paper, we study h...
Growing user expectations of anywhere, anytime access to information require new types of data tran...
Columnar databases have dominated the data analysis market for their superior performance in query p...
Data compression is one way to gain better performance from a database. Compression is typically ach...
Column-oriented database system architectures invite a reevaluation of how and when data in database...
Modern columnar databases heavily use compression to reduce memory footprint and boost query executi...
Nowadays, massive amounts of point cloud data can be collected thanks to advances in data acquisitio...
Modern in-memory databases are typically used for high-performance workloads; therefore, they have to...
Column-oriented databases have continued to grow over the past few decades. C-Store, Vertica, MonetDB...
Compute cycles in high-performance systems are increasing at a much faster pace than both s...
Domain encoding is a common technique to compress the columns of a column store and to accelerate ma...
We argue for a richer view of the space of lightweight compression schemes for columnar DBMSes: We d...
Compressed bitmap indexes are used in databases and search engines. Many bitmap compression techniq...
The rapid growth of fast analytics systems that require data processing in memory makes memory cap...