As parallel and distributed systems scale to hundreds of thousands of cores and beyond, fault tolerance becomes increasingly important -- particularly on systems with limited I/O capacity and bandwidth. Error correcting codes (ECCs) are used in communication systems where errors arise when bits are corrupted silently in a message. ECCs can detect and correct erroneous bits. Erasure codes, an instance of error correcting codes that deal with data erasures, are widely used in storage systems. An erasure code adds redundancy to the data to tolerate erasures. In this thesis, erasure coded computations are proposed as a novel approach to dealing with processor faults in parallel and distributed systems. We first give a brie...
The advent of the information age has bestowed upon us three challenges related to the way we deal w...
P2P computing platforms are subject to a wide range of attacks. In this paper, we propose a...
Technology scaling advancement coupled with operational and environmental effects make emb...
Some emerging classes of distributed computing systems, such as peer-to-peer or grid computing computin...
As the desire of scientists to perform ever larger computations drives the size of today’s high perf...
As the number of processors in today’s parallel systems continues to grow, the mean-time-to-failure ...
Iterative methods are commonly used approaches to solve large, sparse linear systems, which are fund...
We live in an age of data ubiquity. Even the most conservative estimates predict exponential growth in ...
Distributed storage systems store a substantial amount of data on many commodity servers. As servers...
An error correcting code is a technique of adding extra information to a message such that it can be...
Version of 30 June 2010. Content distribution systems are developing quickly. In these systems, in ad...
This thesis investigates the role of error-correcting codes in Distributed and Pervasive Computing....
The amount of digital data is rapidly growing. There is an increasing use of a wide range of compute...
With the internet growing exponentially, the amount of information stored digitally becomes enormous...
We present a new approach to fault tolerance for High Performance Computing systems. Our approach is ...