Small files are known to pose major performance challenges for file systems. Yet such workloads are increasingly common in Big Data Analytics workflows and large-scale HPC simulations. These challenges stem mainly from the common architecture of most state-of-the-art file systems, which need one or more metadata requests before they can read from a file. The overhead of this metadata management grows in relative importance as the size of each file decreases. In this paper we propose a set of techniques leveraging consistent hashing and dynamic metadata replication to significantly reduce this metadata overhead. We implement such techniques inside a new file system named ...
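As an illustration of why consistent hashing can remove a metadata round trip, the following is a minimal sketch, not the paper's implementation: a client hashes the file path onto a ring of storage nodes and so learns which node to contact without asking a metadata server. The class name `HashRing`, the node names, the example path, and the choice of SHA-256 with 64 virtual nodes per server are all assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Map a string to a 64-bit integer position on the ring.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping file paths to storage nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed at `vnodes` positions to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def locate(self, path: str) -> str:
        # The owner is the first ring position clockwise from the path's hash,
        # wrapping around to the start of the ring when needed.
        idx = bisect.bisect_right(self._points, _hash(path)) % len(self._ring)
        return self._ring[idx][1]


# The client computes the owner locally, with no metadata request.
ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.locate("/data/run42/part-00017.bin")
```

In this sketch, adding a node only reassigns the paths whose ring position now falls just before one of the new node's points; every other path keeps its owner, which is the property that makes the scheme attractive for elastic storage clusters.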
Cloud computing applications require a scalable, elastic and fault-tolerant storage system...
HDFS faces several issues when it comes to handling a large number of small files. These issues are ...
The Hadoop Distributed File System (HDFS) scales to store tens of petabytes of data despite the fact...
The growing size of modern storage systems is expected to achieve and exceed billions of objects, ma...
This paper proposes architectural refinements, server-driven metadata prefetching and namespace flat...
Parallel file systems are often characterized by a layered architecture that decouples metadata mana...
Data-set sizes are growing. New techniques are emerging to organize and analyze these data-sets. The...
Cloud computing applications require a scalable, elastic and fault tolerant storage system. In this ...
Modern parallel and cluster file systems provide highly scalable I/O bandwidth by enabling highly pa...
Existing file systems, even the most scalable systems that store hundreds of petabytes (or more) of ...
Metadata performance scalability is critically important in high-performance computing when accessin...
Global file system namespaces are difficult to scale because of the overheads of POSIX IO metadata m...
New challenges to file systems' metadata performance are imposed by the continuously growing number ...