Existing big-data systems (e.g., Hadoop/MapReduce) do not expose management of shared storage I/O resources. As such, an application’s performance may degrade in unpredictable ways under I/O contention, even with fair sharing of computing resources. This paper proposes IBIS, a new Interposed Big-data I/O Scheduler, to provide performance differentiation for competing applications’ I/Os in a shared MapReduce-type big-data system. IBIS is implemented in Hadoop by interposing HDFS I/Os and uses an SFQ-based proportional-sharing algorithm. Experiments show that IBIS provides strong performance isolation for one application against another highly I/O-intensive application. IBIS also enforces good proportional sharing of the global bandwidth ...
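The abstract names an SFQ-based proportional-sharing algorithm without giving details. The following is a minimal single-node sketch of textbook start-time fair queueing (SFQ), offered only to illustrate the general idea; it is not the IBIS implementation, and the class, method, and parameter names (`SFQScheduler`, `submit`, `dispatch`, `simulate`, the abstract `cost` unit) are hypothetical. Each request receives a start tag equal to the larger of the current virtual time and the finish tag of the same application's previous request, and requests are dispatched in increasing start-tag order.

```python
import heapq
from collections import defaultdict


class SFQScheduler:
    """Textbook start-time fair queueing sketch; all names are hypothetical."""

    def __init__(self, weights):
        self.weights = dict(weights)           # app id -> share weight
        self.virtual_time = 0.0                # start tag of the last dispatched request
        self.last_finish = defaultdict(float)  # app id -> finish tag of its latest request
        self._queue = []                       # min-heap ordered by start tag
        self._seq = 0                          # tie-breaker: FIFO order on equal tags

    def submit(self, app, cost):
        """Tag an I/O request of abstract size `cost` from `app` and enqueue it."""
        start = max(self.virtual_time, self.last_finish[app])
        finish = start + cost / self.weights[app]
        self.last_finish[app] = finish
        heapq.heappush(self._queue, (start, self._seq, app, cost))
        self._seq += 1

    def dispatch(self):
        """Pop the request with the smallest start tag, or None if the queue is empty."""
        if not self._queue:
            return None
        start, _, app, cost = heapq.heappop(self._queue)
        self.virtual_time = start
        return app, cost


def simulate(weights, requests_per_app, cost, dispatches):
    """Backlog every app with equal-cost requests and count service per app."""
    sched = SFQScheduler(weights)
    for app in weights:
        for _ in range(requests_per_app):
            sched.submit(app, cost)
    served = defaultdict(int)
    for _ in range(dispatches):
        item = sched.dispatch()
        if item is None:
            break
        served[item[0]] += item[1]
    return dict(served)
```

With two continuously backlogged applications weighted 2:1, `simulate({"A": 2, "B": 1}, 100, 1, 30)` serves them in a 2:1 ratio. A distributed scheduler such as the one the abstract describes would additionally have to coordinate shares across nodes, which this sketch does not attempt.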
Whereas traditional scientific applications are computationally intensive, recent applications requi...
Abstract — The Hadoop Distributed File System (HDFS) is designed to store large data sets reliably a...
Hadoop is a popular open-source implementation of MapReduce for the analysis of large datasets. To m...
Computing systems are becoming increasingly data-intensive because of the explosion of data and the ...
Abstract: The performance gap between compute and storage is fairly considerable. This results in a m...
Big data has entered every corner of the fields of science and engineering and has become a part of hum...
Data generated in the past few years cannot be efficiently manipulated with the traditional way of s...
Abstract—Steady growth in storage and processing capabilities has led to the accumulation of large-s...
Abstract: The term ‘Big Data’ describes innovative techniques and technologies to capture, store, d...
As of 2017, we live in a data-driven world where data-intensive applications are bringing fundamenta...
Unmatched computation and storage performance in new HPC systems have led to a...
Abstract — As a leading framework for data intensive computing, MapReduce has gained enormous popula...
Distributed applications, especially I/O-intensive ones, often access the storage subsyste...
For large-scale parallel applications, MapReduce is a widely used programming model. MapReduce is an ...