Gracefully recovering from software and hardware faults is important for ensuring highly reliable and available systems. Because operating systems have privileged access to all aspects of system operation, a fault in the operating system can affect the entire system. Existing approaches to operating system recovery either do not protect the entire system or require a completely new operating system design. This dissertation presents Recovery Domains, a new approach to fault recovery in operating systems that allows recovery from unanticipated faults in commodity operating systems. Recovery is organized around the concept of a dynamic request. Operating system entry points initiate requests to perform some action. System...
Recovery is an essential part of databases and most computer systems, because it enables a system to...
Much research has gone into making operating systems more amenable to recovery and more resilient to...
The modeling and design of a fault-tolerant multiprocessor system is addressed in this dissertation....
User applications and data in volatile memory are usually lost when an operating system crashes beca...
This study focuses on how to confine error recovery to the immediate environment of a failed computa...
We present an in-depth analysis of the crash-recovery problem and propose a novel approach to recove...
Operating systems often manage critical infrastructures where failures can have serious consequences...
We present a new technique that enables software recovery in legacy applications by retrofitting exc...
In this paper we present a recovery-conscious framework for improving the fault resiliency and reco...
Despite many decades of research, the management of errors in a live operating system remains a chal...
An experimental design study was done to investigate three research questions: (1) Can a software sy...
This paper presents a recovery mechanism for memory-resident databases. It uses some stable memory an...
Dependability is becoming a requirement in an increasing number of domains, including those that wer...
Autonomic software recovery enables software to automatically detect and recover software f...