Online failure prediction approaches aim to predict the manifestation of failures at runtime, before the failures actually occur. Existing approaches generally refrain from collecting internal execution data, even though such data could further improve prediction quality. One reason for this trend is the runtime overhead incurred by the measurement instruments that collect the data: because these approaches target deployed software systems, excessive runtime overhead is generally undesirable. In this work, we conjecture that large cost reductions in collecting internal execution data for online failure prediction may derive from pushing substantial parts of the data collection work onto the hardware. To test this hypothesis...
Failure prediction is an important aspect of self-aware computing systems. Therefore, a multitude of...
Online service failures in production computing environments are notoriously difficult to debug. Wh...
Traditionally, performance has been the most important metric when evaluating a system. However, in...
Online failure prediction aims to predict the manifestation of failures at runtime before the failur...
Online failure prediction is an approach that aims to increase system reliability by predicting pend...
Master's thesis in information and communication technology, IKT590, 2011 – Universitetet i Agder, Grims...
Failures at runtime in complex software systems are inevitable because these systems usually cont...
In safety-critical systems such as Air Traffic Control systems, SCADA systems, Railway Control Syste...
We analyze hardware sensor data to predict failures in a high-end computer server. Features are ext...
With ever-growing complexity and dynamicity of computer systems, proactive fault management is an ef...
In this paper, we present the Framework for building Failure Prediction Models (F2PM), a Machin...
As society becomes more dependent upon computer systems to perform increasingly critical tasks, ensu...
Complex software systems experience failures at runtime even though a lot of effort is put into t...
Raw data of hardware performance counters from benchmarks used in developing the early detection and...
This thesis introduces a novel approach to online failure prediction for mission critical distribute...