To enable data locality, we have developed an approach that adds coordinated caches to existing compute clusters. Since the data stored locally is volatile and selected dynamically, only a fraction of local storage space is required. Our approach allows the degree of data locality to be selected freely. It can work in conjunction with high network bandwidth, caching only heavily used data to reduce peak loads. Alternatively, local storage may be scaled up to perform data analysis even with low network bandwidth. To demonstrate the applicability of our approach, we have developed a prototype implementing all required functionality. It integrates seamlessly into batch systems, requiring practically no adjustments by users...
The cost of data movement has always been an important concern in high perform...
Current market tendencies show the need of storing and processing rapidly growing amounts of data. ...
Large-scale scientific experiments increasingly rely on geo-distributed clouds...
The heavily increasing amount of data produced by current experiments in high energy particle physic...
Data-intensive end-user analyses in high energy physics require high data throughput to reach short ...
Modern data processing increasingly relies on data locality for performance and scalability, whereas...
With the second run period of the LHC, high energy physics collaborations will have to face increasi...
High throughput and short turnaround cycles are core requirements for efficient processing of data-i...
Modern computing platforms are increasingly complex, with multiple cores, shar...
Large-scale scientific experiments increasingly rely on geo-distributed clouds to serve relevant da...
Parallel computing platforms are increasingly complex, with multiple cores, shared caches, and NUMA ...
Large scale computing infrastructures have been widely developed with the core objective of providin...
Extensive data analysis has become the enabler for diagnostics and decision making in many modern sy...
Big data has revolutionized science and technology leading to the transformation of our societies. H...
Recent years have witnessed the prevalence of MapReduce-based systems, e.g., Apache Hadoop, in large...