The analysis of monitoring data is extremely valuable for critical computer systems. It provides insight into the failure behavior of a system under real workload conditions, which is crucial for ensuring service continuity and reducing downtime. This paper presents an experimental evaluation of different direct monitoring techniques, namely event logs, assertions, and source code instrumentation, that are widely used in critical industrial systems. We inject 12,733 software faults into a real-world air traffic control (ATC) middleware system to analyze the ability of these techniques to produce information when failures occur. Experimental results indicate that each technique is able to cover a li...
This thesis introduces a novel approach to online failure prediction for mission critical distribute...
The ultimate goal of the research presented in this paper is analysis, modeling, and prediction of f...
Event logs are the primary source of data to characterize the dependability behavior of a computing ...
Monitoring is a consolidated practice to characterize the dependability behavior of a software syste...
Event logs are the first place to look for useful information about application failures. Event l...
Error data collected at runtime play a key role in dependability analysis and improvement of softwa...
Error propagation analysis is a consolidated practice to gain insights into error modes and effects ...
Direct Monitoring Dataset is a collection of data obtained during an experimental analysis of differ...
The level of trust in log-based dependability characterization of complex distributed systems is bi...
Middleware plays a strategic role in reducing development cost and time to market. However, it raises ...
Software faults are recognized to be among the main causes of system failures in many applicat...
Failure analysis is valuable to dependability engineers because it supports designing effective miti...
Field Failure Data Analysis (FFDA) is a widely adopted methodology to characterize the depe...
Event logs have been widely used over the last three decades to analyze the failure behavior of a va...