Despite improvements in software development and maintenance processes over the last decades, runtime failures remain inevitable because of the increasing complexity of today's systems. Even a short outage of a critical service can have catastrophic consequences for both the economy and human lives. Online failure management is an approach that aims to counteract problems occurring at runtime so that the system can continue to operate normally. It comprises two main steps. First, pending problems need to be predicted and identified before they actually occur. Second, once failures can be anticipated, failure avoidance techniques have to be applied to prevent the failures from occurring and...
Malfunction or breakdown of certain mission critical systems (MCSs) may cause losses of life, damage...
Assuring high reliability levels in complex software systems is difficult. The spread of component-b...
Failure prediction is one of the key challenges that have to be mastered for a new arena of fault to...
Online failure prediction is an approach that aims to increase system reliability by predicting pend...
With ever-growing complexity and dynamicity of computer systems, proactive fault management is an ef...
Cyber-physical systems (CPSs) have sophisticated control mechanisms that help achieve optimal system...
Typically, emerging system failures have a strong impact on the performance of industrial systems as...
As society becomes more dependent upon computer systems to perform increasingly critical tasks, ensu...
In safety-critical systems such as Air Traffic Control system, SCADA systems, Railways Control Syste...
Distributed software systems have become the backbone of Internet services. Failures in production ...
In new product development processes the ramp-up phase is the most critical. For the first time the ...
Online failure prediction approaches aim to predict the manifestation of failures at runtime before ...
My research focuses on both policy and mechanism for managing datacenter-scale installations (thousa...
The increasing use of online channels for service delivery raises new challenges in service failure ...