We report our experiences porting Spark to large production HPC systems. While Spark performance in a data center installation (with local disks) is dominated by the network, our results show that file system metadata access latency can dominate in an HPC installation using Lustre: it can make single-node performance up to 4× slower than on a typical workstation. We evaluate a combination of software techniques and hardware configurations designed to address this problem. For example, on the software side we develop a file pooling layer able to improve per-node performance by up to 2.8×. On the hardware side we evaluate a system with a large NVRAM buffer between compute nodes and the backend Lustre file system: this improves scaling at the expen...
Lustre is a GPLed cluster file system for Linux that is currently being tested on three of the world...
Nowadays, power and energy consumption are of paramount importance. Further, r...
As high-performance computing (HPC) systems advance towards exascale (10^18 operations per second), ...
In this paper we present a framework to enable data-intensive Spark workloads on MareNostru...
Big Data analytics frameworks (e.g., Apache Hadoop and Apache Spark) have been...
In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a peta...
© 2014 IEEE. The increasing demands of big data applications have led researchers and practitione...
Energy consumption is by far the most important contributor to HPC cluster operational costs, and it...
Networks are the backbone of modern HPC systems. They serve as a critical piece of infrastructure, t...
The sheer increase in the volume of data over the last decade has triggered research in cluster comp...
Sheer increase in volume of data over the last decade has triggered research in cluster computing fr...
MapReduce has emerged as a popular and easy-to-use programming model for numerous organizat...
In the last decades, high-performance large-scale systems have been a fundamental tool for scientifi...
Best paper award. Spark is being successfully used for big data parallel proces...