Many critical services are nowadays provided by large and complex software systems. However, the increasing complexity introduces several sources of non-determinism, which may lead to hang failures: the system appears to be running, but some of its services are perceived as unresponsive. On-line monitoring is the only way to detect and to promptly react to such failures. However, when dealing with Off-The-Shelf-based systems, on-line detection can be tricky since instrumentation and log data collection may not be feasible in practice. In this paper, a detection framework to cope with software hangs is proposed. The framework enables the non-intrusive monitoring of complex systems, based on multiple sources of data gathered at the Operating S...
The analysis of monitoring data is extremely valuable for critical computer systems. It allows to ga...
The next generation of software systems in Large-scale Complex Critical Infrastructures (LCCIs) requ...
This work addresses the problem of software fault diagnosis in complex safety-critical software syst...
Many critical services are nowadays provided by large and complex software systems. However, the inc...
On-line failure detection is an essential means to control and assess the dependability of complex a...
Software systems employed in critical scenarios are increasingly large and complex. The usage of man...
We propose a fault injection framework to assess hang detection facilities within the Linux...
Revealing anomalies at the operating system (OS) level to support online diagnosis activities of com...
The next generation of critical systems, namely complex Critical Infrastructures (LCCIs), requires ef...
Distributed systems form an integral part of human life—from ATMs to the Domain Name Service. Typica...
This paper proposes an approach to software fault diagnosis in complex fault-tolerant systems, enco...
Dependable complex systems often operate under variable and non-stationary conditions, which require...
Modern component frameworks support continuous deployment and simultaneous exe...