Semistatic word-based byte-oriented compressors are known to be attractive alternatives for compressing natural language text. With compression ratios around 30%-35%, they allow fast direct searching of the compressed text. In this article, we show that these compressors offer further benefits: most state-of-the-art compressors perform better when they compress not the original text but the compressed representation produced by a word-based byte-oriented statistical compressor. For example, p7zip with a dense-coding preprocessing step achieves better compression ratios and much faster compression than p7zip alone. We reach compression ratios below 17% on typical large English texts, a level that was previously obtained only by the slow prediction by p...
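The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes End-Tagged Dense Code as the dense-coding stage (one member of the dense-code family, chosen here for brevity), a simple regular-expression tokenizer rather than a spaceless word model, and Python's standard `lzma` module standing in for p7zip. The function names are hypothetical, and the vocabulary is returned separately instead of being serialized into the output.

```python
import lzma
import re
from collections import Counter


def etdc_codeword(rank):
    """Return an End-Tagged Dense Code codeword for a 0-based frequency rank."""
    out = [128 + rank % 128]      # last byte has its high bit set (end tag)
    rank = rank // 128 - 1
    while rank >= 0:
        out.append(rank % 128)    # earlier bytes stay below 128
        rank = rank // 128 - 1
    return bytes(reversed(out))


def dense_encode(text):
    """Stage 1: replace each token by the codeword of its frequency rank."""
    # Assumption: alternating runs of word and non-word characters are tokens.
    tokens = re.findall(r"\w+|\W+", text)
    # Semistatic model: a first pass gathers frequencies, most frequent first.
    vocab = [tok for tok, _ in Counter(tokens).most_common()]
    rank_of = {tok: i for i, tok in enumerate(vocab)}
    encoded = b"".join(etdc_codeword(rank_of[tok]) for tok in tokens)
    return vocab, encoded


def dense_decode(vocab, encoded):
    """Invert dense_encode using the same vocabulary."""
    tokens, rank = [], 0
    for b in encoded:
        if b >= 128:              # end-tagged byte closes the codeword
            tokens.append(vocab[rank * 128 + (b - 128)])
            rank = 0
        else:                     # continuation byte
            rank = rank * 128 + b + 1
    return "".join(tokens)


def compress(text):
    """Stage 2: hand the byte-oriented output to a general-purpose compressor."""
    vocab, encoded = dense_encode(text)
    return vocab, lzma.compress(encoded)


def decompress(vocab, blob):
    return dense_decode(vocab, lzma.decompress(blob))
```

Calling `compress(text)` returns the vocabulary and the LZMA-compressed codeword stream, and `decompress(vocab, blob)` restores the original string. The sketch only shows how the stages fit together; it makes no claim about reproducing the compression ratios reported above.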
We address the problem of adaptive compression of natural language text, focusing on the case where ...
In this paper we present the adaptation of a compression technique, specially designed to co...
An algorithm for very efficient compression of a set of natural language text files is presented. No...
Semistatic word-based byte-oriented compression codes are known to be attractive alternatives to com...
Semistatic byte-oriented word-based compression codes have been shown to be an attractive alternativ...
Full-text indexes provide fast substring search over large text collections. A serious problem of th...
This work presents (s, c)-Dense Code, a new method for compressing natural language texts. This tec...
We report on a new and improved version of high-order entropy-compressed suffix arrays, which has th...
Text databases have been growing in recent years due to the widespread use of digital librar...
We present a technique to build an index based on suffix arrays for compressed texts. We also...
This thesis deals with space-efficient algorithms to compress and index texts. The aim of compressio...
The rise of repetitive datasets has lately generated a lot of interest in compressed self-indexes ba...
We design two compressed data structures for the full-text indexing problem that support efficient s...
Compressed text (self-)indexes have matured up to a point where they can replace a text by a data s...