Storage solutions often use chunking algorithms to factorize and deduplicate data. Such algorithms assume that consecutive versions of a file share many similarities. Unfortunately, file formats often use compression algorithms, and a minor change can completely reorganize the internal layout of a file. As a result, chunking algorithms become less efficient at factorizing data. In this paper, we evaluate content-defined chunking on file formats that use data compression. We show how content-defined chunking algorithms can take the file format into account. Finally, we demonstrate that adding file format knowledge to a popular chunking algorithm significantly improves its performance.
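The effect described in the abstract can be illustrated with a small sketch. It is not the chunking algorithm evaluated in the paper: it uses a simplified gear-style rolling hash, and the mask, the minimum chunk size, the generated test data, and the helper names (`chunks`, `shared_fraction`, `make_versions`) are all assumptions made for illustration. The script measures what fraction of the chunks of a second file version is already known from the first version, once on the raw bytes and once on the zlib-compressed bytes. The last measurement decompresses before chunking, which is one hypothetical way of "taking the file format into account".

```python
import hashlib
import random
import zlib

# Illustrative parameters (assumptions, not taken from the paper).
MASK = (1 << 6) - 1   # a cut candidate appears roughly every 64 bytes
MIN_SIZE = 16         # no chunk is shorter than this, except possibly the last

# Gear table: one fixed pseudo-random 32-bit value per possible byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunks(data: bytes) -> list:
    """Split data where a gear-style rolling hash has its low bits at zero."""
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The low 6 bits of h depend only on the last 6 bytes seen,
        # so cut points are defined by content, not by absolute offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= MIN_SIZE and (h & MASK) == 0:
            out.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])  # trailing chunk
    return out


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of the chunks of `new` that already occur in `old`."""
    known = {hashlib.sha256(c).digest() for c in chunks(old)}
    new_chunks = chunks(new)
    if not new_chunks:
        return 0.0
    hits = sum(1 for c in new_chunks if hashlib.sha256(c).digest() in known)
    return hits / len(new_chunks)


def make_versions():
    """Build two synthetic file versions that differ by one inserted line."""
    rng = random.Random(1)
    lines = [f"record {i} value {rng.randint(0, 10**6)}\n" for i in range(500)]
    old = "".join(lines).encode()
    lines.insert(10, "record extra value 42\n")  # the "minor change"
    new = "".join(lines).encode()
    return old, new


old, new = make_versions()
z_old, z_new = zlib.compress(old), zlib.compress(new)

plain = shared_fraction(old, new)        # chunk the raw bytes
packed = shared_fraction(z_old, z_new)   # chunk the compressed bytes
# Hypothetical format-aware variant: undo the compression, then chunk.
aware = shared_fraction(zlib.decompress(z_old), zlib.decompress(z_new))

print(f"raw: {plain:.2f}  compressed: {packed:.2f}  format-aware: {aware:.2f}")
```

Because decompression restores the original bytes, the format-aware figure equals the raw figure by construction. The compressed figure depends on how the compressor reorganizes its output after the edit, so the script prints it rather than assuming a value.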
Data Compression is today essential for a wide range of applications: for example Internet and the W...
This paper attempts to evaluate the capacity of immediate memory to cope with ...
Sequence data structures, i.e., data structures that provide operations on an ...
Data deduplication techniques are often used by cloud storage systems to reduce network bandwidth an...
Data compression could ameliorate the I/O pressure of scientific applications on high-perfo...
When a file is to be transmitted from a sender to a recipient and when the latter already ha...
Deduplication is an efficient data reduction technique, and it is used to mitigate the problem of hu...
Many modern, large-scale storage solutions offer deduplication, which can achi...
Data deduplication has become a popular technology for reducing the amount of storage space necessary ...
Disk-based backup storage system is utilized widely, and data deduplication is bec...
Duplicate Elimination (DE) is a specialized data compression technique for eliminating dup...
In order to achieve energy saving and reduce the total cost of ownership, green storage has become t...
When exposed to perceptual and motor sequences, people are able to gradually identify patterns withi...
Memory for verbal material improves when words form familiar chunks. But how does the improvement du...
Given the size of today’s data, out-of-core visualization techniques are increasingly imp...