The massive scale of current and next-generation massively parallel processing (MPP) systems presents significant challenges related to fault tolerance. For applications that perform periodic checkpoints, the choice of the checkpoint interval, the period between checkpoints, can have a significant impact on the execution time of the application. Finding the optimal checkpoint interval that minimizes the wall clock execution time has been a subject of research over the last decade. In an environment where concurrent applications compete for access to the network and storage resources, contention at these shared resources needs to be factored, in addition to application execution times, into the process of choosing checkpoint inter...
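As a rough illustration of why the checkpoint interval matters, the sketch below uses Young's classical first-order approximation, which is not taken from this abstract and deliberately ignores the network and storage contention the abstract is concerned with. The function names and the sample values (a 5-minute checkpoint cost and a 24-hour mean time between failures) are assumptions chosen for illustration only.

```python
import math


def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the optimal checkpoint interval.

    Assumes a fixed checkpoint cost and a mean time between failures (MTBF)
    much larger than that cost. Contention is NOT modeled here.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def overhead_fraction(interval_s: float, checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order overhead per unit of useful work.

    checkpoint_cost_s / interval_s : time spent writing checkpoints
    interval_s / (2 * mtbf_s)      : expected rework after a failure
    """
    return checkpoint_cost_s / interval_s + interval_s / (2.0 * mtbf_s)


if __name__ == "__main__":
    cost, mtbf = 300.0, 86_400.0  # hypothetical: 5 min checkpoint, 24 h MTBF
    best = young_interval(cost, mtbf)
    for interval in (best / 2, best, best * 2):
        print(f"interval={interval:8.0f} s  overhead={overhead_fraction(interval, cost, mtbf):.4f}")
```

Checkpointing too often wastes time on writing state, while checkpointing too rarely increases the work lost per failure; the approximation balances the two terms. When several applications share the I/O subsystem, the effective checkpoint cost is no longer a constant, which is the case the abstract argues must be handled separately.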
This short paper deals with parallel scientific applications using non-blocking and periodic co-ordi...
In this paper, we design and analyze strategies to replicate the execution of ...
In high-performance computing environments, input/output (I/O) from various s...
The large scale of current and next-generation massively parallel processing (MPP) systems presents ...
The large scale of current and next-generation massively parallel processing (MPP) systems presents ...
The massive scale of current and next-generation massively parallel processing (MPP) systems present...
Input/output (I/O) from various sources often contend for scarcely available b...
Input/output (I/O) from various sources often contend for scarcely available b...
Over the last decade, computing systems have turned to large-scale parallel platforms composed of thousand...
Input/output (I/O) from various sources often contend for scarcely available b...
Input/output (I/O) from various sources often contend for scarcely available b...
Checkpointing is commonly adopted for enhancing the performance of software applications that operat...
Researchers have mentioned that the three most difficult and growing problems in the future of high-...
This report provides an introduction to the design of scheduling algorithms to cope with faults on l...
This report provides an introduction to the design of scheduling algorithms to cope with faults on l...